ABSTRACT

The mapping between the distributions of the observed galaxy stellar mass and the underlying 
dark matter halos provides the crucial link from theories of large-scale structure formation to 
interpreting the complex phenomena of galaxy formation and evolution. We develop a novel 
statistical method, based on the Halo Occupation Distribution model (HOD), to solve for 
this mapping by jointly fitting the galaxy clustering and the galaxy-galaxy lensing measured 
from the Sloan Digital Sky Survey (SDSS). The method, called the iHOD model, extracts 
maximum information from the survey by including ~80% more galaxies than the traditional
HOD methods, and takes into account the incompleteness of the stellar mass samples in a 
statistically consistent manner. The derived stellar-to-halo mass relation not only explains the 
clustering and lensing of SDSS galaxies over almost four decades in stellar mass, but also 
successfully predicts the stellar mass functions observed in SDSS. Due to its capability of 
modelling significantly more galaxies, the iHOD is able to break the degeneracy between the 
logarithmic scatter in the stellar mass at fixed halo mass and the slope of the stellar-to-halo 
mass relation at high mass end, without the need to assume a strong prior on the scatter and/or 
use the stellar mass function as an input. We detect a decline of the scatter with halo mass, 
from $0.22^{+0.02}_{-0.01}$ dex below $10^{12}\,h^{-1}M_\odot$ to $0.18 \pm 0.01$ dex at $10^{14}\,h^{-1}M_\odot$. The model
also enables stringent constraints on the satellite stellar mass functions at fixed halo mass, 
predicting a departure from the Schechter functional form in high mass halos. The iHOD 
model can be easily applied to existing and future spectroscopic datasets, greatly improving 
the statistical constraint on the stellar-to-halo mass relation compared to the traditional HOD 
methods within the same survey. 

Key words: cosmology: observations – cosmology: large-scale structure of Universe –
galaxies: luminosity function, mass function – gravitational lensing: weak – methods: statistical


1 INTRODUCTION 


The distribution of galaxy stellar mass in the present-day Universe
provides important clues to answering fundamental questions concerning
the cosmic assembly of baryons in the Λ-cold dark matter (ΛCDM) paradigm
(Fukugita et al. 1998; Kereš et al. 2005; Faucher-Giguère et al. 2011;
Davé et al. 2012). In particular, what fraction of baryons are condensed
into stars as opposed to spreading out in the form of gas and dust in and
outside of galaxies (Cen & Ostriker 1999; McGaugh et al. 2010; Putman
et al. 2012; pi^ et al. 2012)? Among those locked in stars, how much
stellar mass is stored within the central galaxies of the dark matter
halos (De Lucia & Blaizot 2007; von der Linden et al. 2007), and how is
the rest distributed among the satellite galaxies within their larger
host halos (Hansen et al. 2009; Yang et al. 2009; Leauthaud et al. 2012a)?

In this paper we develop a novel statistical approach within the
Halo Occupation Distribution (HOD; Jing et al. 1998; Ma & Fry 2000;
Peacock & Smith 2000; Seljak 2000; Yang et al. 2003; Scoccimarro et al.
2001; Cooray & Sheth 2002; Berlind & Weinberg 2002; Guzik & Seljak 2002;
Zheng et al. 2005; Mandelbaum et al. 2006; van den Bosch et al. 2007)
framework, to solve for the mapping between the stellar mass content and
the dark matter halos using the spatial clustering and the weak
gravitational lensing of the Sloan Digital Sky Survey (SDSS; York et al.
2000) spectroscopic sample galaxies. The inferred mapping not only
explains the galaxy auto-correlation function (i.e., clustering) and the
galaxy-matter cross-correlation (i.e., lensing) successfully, but also
correctly predicts the observed stellar mass function (SMF), placing
strong constraints on the physics that governs the formation and
distribution of galaxies within halos.

The most comprehensive way to link the galaxy properties
to the dark matter halos is to directly model the expected physical
processes involved in the formation and evolution of stars
and gas (along with metals and dust), by either running hydrodynamic
simulations (Hernquist & Katz 1989; Katz et al. 1996; Norman & Bryan
1999; Teyssier 2002; O'Shea et al. 2004; Springel et al. 2005; Di Matteo
et al. 2005; Oppenheimer & Davé 2006; Booth & Schaye 2009; Vogelsberger
et al. 2013) or semi-analytical models (SAMs) along the halo merger
trees of N-body simulations (Baugh 2006; Somerville et al. 2008; Kang
et al. 2005; Bower et al. 2006; De Lucia et al. 2006). The primary
advantage of the latter approach is that it is computationally much less
expensive compared to the hydrodynamic simulations, albeit with a large
number of free parameters. However, even within the hydrodynamic
simulations some of the key processes, like the star formation from
the collapse of molecular clouds, the stellar feedback from supernova
explosions and galactic winds, and the impact of the active
galactic nuclei (i.e., the AGN feedback), are happening on scales
well below the resolution limit, so the treatment of the physics is
often at the “subgrid” level, i.e., put in by hand using empirical
scaling relations (see Benson 2010 for a review). Both the hydrodynamic
simulations and the SAMs nonetheless have enjoyed
great success over the past decade, reproducing a wide range of
the observed galaxy properties (see Oppenheimer & Davé 2008,
Khandai et al. 2014, and Vogelsberger et al. 2014 for the latest
hydrodynamic simulations and see Guo et al. 2011 for the recent
development in SAM), though still with several possible areas of
improvement. Most importantly, because the simulated outputs are
directly tied to the prescribed subgrid physics and its parameters, by
confronting the predictions of this “forward modelling” approach
with new observations, we can constantly furnish our understanding
of the galaxy formation physics in a cosmological context, especially
the baryonic feedback mechanisms that help sculpt the
galaxy SMF (Fontanot et al. 2009). However, the computational
complexities of hydrodynamic simulations and the SAM degeneracies
make it difficult to come to definite conclusions about galaxy
formation theories in certain cases.

Alternatively, one can focus on the statistical link between just
the stellar content and the dark matter halos, assuming that the enormous
diversity in the individual galaxy assembly histories inside
halos of the same mass would reduce to a stochastic scatter about
the mean stellar-to-halo mass relation (SHMR) by virtue of the central
limit theorem. The great success of the HOD framework in explaining
the clustering and lensing of galaxies, and their dependencies
on galaxy properties like colour, luminosity, and stellar mass
at different redshifts (e.g., Zehavi et al. 2011; Parejko et al. 2013;
Guo et al. 2014), further validates this assumption about the
statistical simplicity of the SHMR. For a given cosmology, the HOD
formalism describes the relationship between the stellar and dark
matter in terms of $\langle N_g(M_h)\rangle$, the average number of galaxies of
a given type (e.g., central or satellite) within a halo of virial mass
$M_h$, along with the spatial and velocity distributions of galaxies
within that halo (see Yang et al. 2003 for a closely related approach,
the conditional luminosity function). Furthermore, the HOD is
potentially a powerful tool to constrain cosmology (Yoo et al. 2006;
Zheng & Weinberg 2007; van den Bosch et al. 2013; Cacciato et al. 2013;
More et al. 2014), by exploiting the average bias vs. mass relation
of the dark matter halos revealed by the galaxies they contain.
The standard HOD modelling is restricted to individual volume-limited
samples, each defined by a stringent combination of stellar
mass and redshift cuts, which leaves many observed galaxies unused.
Moreover, the corresponding HOD parameters are inferred
separately for each sample, so the SHMR is constrained only at
a disjoint set of points, located at the mean stellar masses of the
samples.

Recently, Leauthaud et al. (2011; hereafter L11) proposed
a new HOD framework by further parameterizing $\langle N_g(M_h)\rangle$
as continuous functions of the threshold stellar mass $M_*$, i.e.,
$\langle N_g(>M_*|M_h)\rangle$. In particular, the mean SHMR of central galaxies
and its scatter fully specify the expected number of central
galaxies of any stellar mass at fixed halo mass, whereas the number
of satellite galaxies above a certain stellar mass scales with
halo mass in a way that is also halo mass-dependent. The advantage
of this new framework lies in its unique capability to
derive the connection between galaxies and halos using multiple
probes simultaneously within a single global HOD model. Leauthaud
et al. (2012b, hereafter L12) demonstrated the efficacy of the
L11 model by inferring the SHMR across the entire observed stellar
mass range, using the combination of the SMF, angular galaxy
clustering, and galaxy-galaxy (g-g) lensing of samples above some
critical stellar mass limit (corresponding to high completeness) in
the COSMOS survey. However, similar to the standard HOD technique,
the L11 model requires volume completeness in the stellar
mass samples, thereby losing a lot of data in an intrinsically
flux-limited survey. Our approach is based on the L11 global HOD
framework, and (like in L11) can jointly fit the galaxy clustering
and g-g lensing signals of galaxy samples above some critical stellar
mass limit, but (unlike in L11) can take into account the
incompleteness in a statistically self-consistent way. As a result of the
added flexibility of our model, we are able to include significantly
more SDSS galaxies at both lower stellar mass and higher redshift,
equivalent to almost doubling the survey volume.
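The statement that the mean SHMR and its scatter fully specify the central occupation can be illustrated with a minimal sketch. It assumes a log-normal scatter about the mean relation, which is a common choice for this family of models rather than something spelled out in the text above. The power-law SHMR, its pivot values, and the default scatter are hypothetical placeholders, not fitted values.

```python
import math

def mean_lg_mstar(m_halo, m_pivot=1e12, mstar_pivot=2e10, slope=0.5):
    """Hypothetical power-law SHMR: lg of the mean central stellar mass at m_halo."""
    return math.log10(mstar_pivot) + slope * math.log10(m_halo / m_pivot)

def n_cen_above(mstar_min, m_halo, sigma_dex=0.2):
    """<N_cen(>M*|M_h)> for a log-normal scatter of sigma_dex about the mean SHMR
    (assumed form; a Gaussian in lg M* integrated above the threshold)."""
    x = (math.log10(mstar_min) - mean_lg_mstar(m_halo)) / (math.sqrt(2.0) * sigma_dex)
    return 0.5 * (1.0 - math.erf(x))

def n_cen_in_bin(mstar_lo, mstar_hi, m_halo, sigma_dex=0.2):
    """Mean number of centrals with mstar_lo <= M* < mstar_hi at fixed halo mass."""
    return n_cen_above(mstar_lo, m_halo, sigma_dex) - n_cen_above(mstar_hi, m_halo, sigma_dex)
```

In this form the occupation of any stellar mass bin follows by differencing two thresholds, which is why a single global model can serve samples with different stellar mass cuts.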

Meanwhile, Moster et al. (2010, hereafter M10) and other
groups (e.g., Kravtsov et al. 2004; Vale & Ostriker 2004; Conroy
et al. 2006; Shankar et al. 2006; Behroozi et al. 2010; Guo
et al. 2010) adopted the so-called “sub-halo abundance matching”
(SHAM) technique to assign stellar masses or luminosities
to individual dark matter halos (including both main and subhalos)
in the N-body simulations. In its simplest form, the SHAM
method assumes a monotonic relation between the SMF estimated
from galaxy surveys and the halo mass function predicted by the
ΛCDM. Instead of using the subhalo mass at the current epoch,
most SHAM studies employed the “infall mass”, i.e., the mass of
the subhalo before its accretion onto the main halo, and instead of a
monotonic relation they either assumed some fixed scatter or drew
the scatter from external priors. Using the “infall mass” as a proxy
for the stellar mass, they found surprisingly good agreement between
the predicted and the observed clustering and lensing statistics.
The underlying assumption behind SHAM is that the satellite
galaxies that live in subhalos were central galaxies of their own
halos before accretion. After becoming satellites, the dark matter
masses of their subhalos were reduced due to the combined effect
of tidal stripping and dynamical friction, whereas their stellar
masses in the centre of those subhalos were much less affected
and are thus more closely tied to the infall masses (Reddick et al.
2013; van den Bosch et al. 2005). Hearin & Watson (2013) later
extended this assumption to further relate the halo formation time to
the colour of each galaxy to interpret the colour dependence of the
clustering and lensing statistics (also see Kulier & Ostriker 2015
for an interesting proposal of a two-parameter abundance matching
scheme). The key distinction between the SHAM and the HOD
models is that, by using the infall mass, the SHAM method evades
the need to parameterize the HOD of the satellite galaxies (but see
Guo et al. 2011 on potential numerical issues in tracking subhalos),
which is usually regarded as a nuisance in traditional HOD modelling,
at the expense of assuming the same SHMRs for the central
and the satellite galaxies. However, as we will demonstrate in this
study, the capability of our modified HOD model to maximally extract
the information from data enables us to place tight constraints
on the satellite HOD as well, thus shedding new light on the
formation and evolution of the satellite galaxies without imposing
the central SHMR on the satellites. Moreover, the model breaks the
degeneracy between the scatter and the slope of the SHMR,
self-consistently deriving the scatter without the need to assume fixed
values or external priors.
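For readers unfamiliar with SHAM, its simplest zero-scatter form amounts to a rank-order pairing, which can be sketched as follows. This is only an illustration of the monotonic assignment described above: the function name is hypothetical, the inputs are toy lists, and real applications match cumulative number densities (and usually add scatter) rather than equal-length catalogues.

```python
def abundance_match(halo_masses, stellar_masses):
    """Zero-scatter abundance matching on toy catalogues.

    The k-th most massive (sub)halo, e.g. ranked by infall mass, receives the
    k-th largest stellar mass. Assumes len(stellar_masses) >= len(halo_masses).
    """
    # Indices of halos ordered from most to least massive.
    order = sorted(range(len(halo_masses)), key=lambda i: halo_masses[i], reverse=True)
    ranked_mstar = sorted(stellar_masses, reverse=True)
    assigned = [0.0] * len(halo_masses)
    for rank, halo_index in enumerate(order):
        assigned[halo_index] = ranked_mstar[rank]
    return assigned
```

Because the pairing uses a single ranking for main halos and subhalos alike, it builds in the same SHMR for centrals and satellites, which is the assumption the HOD approach avoids.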

The paper is organised as follows. Section 2 describes the
SDSS data, including the large-scale structure galaxy catalogue
and the matched stellar mass catalogue, and the two sets of galaxy
samples selected for our analyses. In Section 3 we describe the
methods we use to measure galaxy clustering and g-g lensing for
these samples. Section 4 introduces our new variant of the HOD
model that we use to interpret the clustering and lensing signals,
and Section 5 describes the method to theoretically predict these
signals using our HOD model. In Section 6 we present the results
of our constraint from a Bayesian framework via the new HOD
modelling technique. We examine the SMF, the SHMR of the central
galaxies, and the conditional SMFs of the satellites predicted
by our best-fit model in Section 7. We summarise our main findings
and discuss future applications of the new HOD model in Section 8.

Throughout this paper, we assume a ΛCDM cosmology
with $(\Omega_m, \Omega_\Lambda, \sigma_8, h) = (0.26, 0.74, 0.77, 0.72)$. All the length
and mass units in this paper are scaled as if the Hubble constant
were $100\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$. In particular, all the separations
are co-moving distances in units of either $h^{-1}$ kpc or
$h^{-1}$ Mpc, and the stellar mass and halo mass are in units of
$h^{-2}M_\odot$ and $h^{-1}M_\odot$, respectively. The halo mass is defined by
$M_h = M_{200m} = 200\rho_m(4\pi/3)r_{200m}^3$, where $r_{200m}$ is the
corresponding halo radius within which the average density of the
enclosed mass is 200 times the mean matter density of the Universe,
$\rho_m$. For the sake of simplicity, $\ln x = \log_e x$ is used for the natural
logarithm, and $\lg x = \log_{10} x$ is used for the base-10 logarithm.
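As a quick worked example of the halo mass definition, the sketch below converts between halo mass and the radius enclosing 200 times the mean matter density. It assumes the standard value of the critical density, about 2.775e11 h^2 Msun Mpc^-3, and works in co-moving units, where the mean matter density is constant. The function names are illustrative only.

```python
import math

RHO_CRIT_H2 = 2.775e11   # critical density today, h^2 Msun / Mpc^3 (standard value)
OMEGA_M = 0.26           # matter density parameter adopted in the text
RHO_M = OMEGA_M * RHO_CRIT_H2   # co-moving mean matter density

def r200m(m_halo):
    """Co-moving radius in h^-1 Mpc for a halo mass m_halo in h^-1 Msun."""
    return (3.0 * m_halo / (4.0 * math.pi * 200.0 * RHO_M)) ** (1.0 / 3.0)

def m200m(radius):
    """Halo mass in h^-1 Msun enclosed within `radius` (h^-1 Mpc) at 200x mean density."""
    return 200.0 * RHO_M * (4.0 * math.pi / 3.0) * radius ** 3
```

With these numbers a halo of $10^{12}\,h^{-1}M_\odot$ has a radius of roughly a quarter of an $h^{-1}$ Mpc.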


2 DESCRIPTION OF THE DATA 

Here we describe the spectroscopic data used to define the cluster¬ 
ing and the lens samples, and the imaging data used to define the 
source samples for g-g lensing. 


2.1 SDSS Main Galaxy Sample and NYU-VAGC 


This study is based on the final data release of the SDSS (DR7;
Abazajian et al. 2009), which contains the completed data set of
the SDSS-I and the SDSS-II. The survey imaged a quarter of the
sky using a drift-scan camera (Gunn et al. 1998) in five photometric
bandpasses (u, g, r, i, z; Fukugita et al. 1996) to a limiting
magnitude of ~22.5 in the r band. The imaging data were
photometrically and astrometrically calibrated (Padmanabhan et al.
2008), and from this imaging data targets were selected for
spectroscopic follow-ups with a fibre-fed double spectrograph (Gunn
et al. 2006). One of the spectroscopic products is “the main galaxy
sample” (MGS; Strauss et al. 2002) that we use in this study as
both the tracers of stellar mass clustering and the lenses of background
sources. In particular, we obtain the MGS data from the
dr72 large-scale structure sample bright0 of the “New York
University Value Added Catalogue” (NYU-VAGC), constructed as
described in Blanton et al. (2005). The bright0 sample includes
galaxies with $10 < m_r < 17.6$, where $m_r$ is the r-band Petrosian
apparent magnitude, corrected for Galactic extinction. We choose a
more relaxed bright limit than the commonly used safe0 sample
($14.5 < m_r < 17.6$) to allow higher completeness on the high
stellar mass end, where constraint on the mapping between galaxies
and halos is particularly lacking.


Due to the finite size of the fibre plugs, no two targets on the
same plate can be closer than 55″, resulting in ~7% of the MGS
galaxies with unknown redshifts. A simple remedy for these fibre
collisions, as used in the bright0 sample, is to assign these galaxies
the redshifts of their nearest neighbours on the sky, thus exactly
preserving the angular clustering signal. For our purpose of measuring
the projected correlation $w_p$ at a given redshift $z$, it would
overestimate the clustering signal below the projected physical scale
corresponding to the fibre size at that redshift (the fibre collision
scale), by including physically distant pairs of galaxies that happen
to align within 55″ on the sky. Above the fibre collision scale,
however, this “nearest-neighbour” redshift assignment scheme recovers
the underlying $w_p$ remarkably well (Zehavi et al. 2005). We thus
limit our $w_p$ measurement to scales above it. We do not consider
other more sophisticated corrections that would potentially recover
the signals below the fibre radius (Guo et al. 2012), as the fibre
collision scale is always below 0.17 $h^{-1}$ Mpc across the redshift
range of our samples ($z_{\max} = 0.30$). Furthermore, even at the
highest redshifts where the fibre collision scale is relatively large,
the samples are progressively dominated by more massive galaxies
that have a larger correlation length, so the one-halo term is still
well resolved in the clustering measurement above that scale. For the
g-g lensing measurement, this correction using the nearest neighbour
slightly blurs the signals around those “collided” lenses in the
transverse direction because of the inaccurate conversion from angles to
physical separations. This blurring effect can be safely ignored in
our analysis as it does not reduce the overall amplitude of the
average lensing signal.
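The size of the fibre collision scale can be checked with a short calculation. The sketch below assumes the flat ΛCDM parameters quoted in the Introduction and integrates the co-moving distance with Simpson's rule; the function names are illustrative only. At $z = 0.30$ it gives about 0.17 $h^{-1}$ Mpc in physical units for a 55″ separation, and a value larger by a factor of $(1+z)$ in co-moving units.

```python
import math

OMEGA_M, OMEGA_L = 0.26, 0.74          # flat LCDM parameters adopted in the text
HUBBLE_DISTANCE = 2997.92458           # c / H0 in h^-1 Mpc
ARCSEC = math.pi / (180.0 * 3600.0)    # radians per arcsecond

def comoving_distance(z, steps=1000):
    """Line-of-sight co-moving distance in h^-1 Mpc (Simpson's rule, even `steps`)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return HUBBLE_DISTANCE * total * h / 3.0

def fibre_scale(z, theta_arcsec=55.0):
    """(co-moving, physical) transverse separation in h^-1 Mpc subtended at redshift z."""
    r_com = comoving_distance(z) * theta_arcsec * ARCSEC
    return r_com, r_com / (1.0 + z)
```

Calling `fibre_scale(0.30)` returns both conventions, which makes it easy to see which one a quoted scale refers to.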


In addition to the angular positions and redshifts of the galaxies,
the NYU-VAGC also calculated k-corrections for all the
MGS galaxies using templates based on stellar population synthesis
(SPS) models, while providing approximate stellar mass estimates
and star formation histories for all the bright0 galaxies
(Blanton & Berlind 2007). Comparison with the stellar mass
estimates from Kauffmann et al. (2003) shows a scatter of only 0.1
dex and mean biases of less than 0.2 dex depending on stellar mass.
This tight relationship, as we show later in Section 2.2, provides
crucial information for estimating stellar masses for galaxies that
do not have valid entries in the MPA/JHU stellar mass catalogue
(an updated version of the Kauffmann et al. 2003 catalogue) that
we adopted.


To minimise any potential artefacts in the measurements due
to irregular survey geometry and areas with low spectroscopic coverage,
we use data exclusively within the contiguous area in the
North Galactic Cap and from sectors (each defined by a region
covered by a unique set of plates) where the angular completeness
is greater than 0.8. The final sample used for the galaxy clustering
analysis includes 513,150 galaxies over a sky area of 6395.49
deg$^2$. A further 5 per cent of the area is eliminated for the lensing
analysis, due to the absence of source galaxies in that area (for
example, because of poor imaging quality or more conservative masking
around bright stars).
2.2 Stellar Mass Estimate


We employ the stellar mass estimates from the latest MPA/JHU
value-added galaxy catalogue.¹ The stellar masses were estimated
based on fits to the SDSS photometry following the philosophy
of Kauffmann et al. (2003) and Salim et al. (2007), and assuming
the Chabrier (2003) initial mass function (IMF) and
the Bruzual & Charlot (2003) SPS model. The estimation is very
similar to the spectroscopic fits as described in Kauffmann et al.
(2003), but instead of spectral features like the D4000 index and
the Hδ absorption lines, the broad-band photometry (after correction
for emission lines) is used for the fits. The key difference between
the photometry-based estimates and the spectroscopic ones
is that the stellar mass-to-light ratio ($M_*/L$) estimated from
spectroscopic features is only representative for the region sampled by
the fibre, and a constant $M_*/L$ must then be assumed to extrapolate
to a total stellar mass. To avoid extrapolation, we employ the
total mass estimates obtained through direct fits to the SDSS total
photometry, i.e., the cmodel magnitudes.

We then match the MPA/JHU stellar mass catalogue to the
NYU-VAGC bright0 sample and identify valid, unambiguous
MPA/JHU stellar mass estimates for all but 32,327 (6.3%) of the
MGS galaxies. Although it is unclear what causes the failure in
estimating stellar mass for these systems, they are statistically
indistinguishable from the matched population in redshift, luminosity, flux,
and colour. Therefore, given the tight scaling relationship between
the NYU-VAGC stellar mass and the MPA/JHU stellar mass,
we can recover the individual photometric stellar mass estimates
for these unmatched galaxies in a way that is statistically consistent
with the matched ones. We first divide the bright0 galaxies into
red and blue populations based on their $g-r$ colours (> or < 0.8;
k-corrected to z=0.1), because the scaling between the two types
of estimates is colour-dependent and is much tighter within each
single-colour population than the entire sample (Bell & de Jong
2001). Within each colour, we then compute the probability
distribution of the MPA/JHU stellar mass at each fixed NYU-VAGC
stellar mass using the successfully matched galaxies, and assign each
unmatched galaxy a random stellar mass drawn from that distribution.
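The assignment procedure for the unmatched galaxies can be sketched as a conditional draw. The version below is a simplified illustration, not the actual pipeline: it assumes the conditional distribution is approximated by the empirical set of matched galaxies in the same colour class and the same bin of NYU-VAGC mass, and the bin width, tuple layout, and function name are hypothetical.

```python
import math
import random
from collections import defaultdict

def assign_missing_masses(matched, unmatched, bin_width=0.1, seed=0):
    """Draw an MPA/JHU-like lg M* for each unmatched galaxy.

    matched:   list of (is_red, lg_m_nyu, lg_m_mpa) for galaxies with both estimates
    unmatched: list of (is_red, lg_m_nyu) for galaxies lacking the MPA/JHU estimate
    Returns one draw per unmatched galaxy, or None if its bin has no matched galaxies.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for is_red, lg_nyu, lg_mpa in matched:
        # Empirical conditional distribution, keyed by colour and NYU-VAGC mass bin.
        pools[(is_red, math.floor(lg_nyu / bin_width))].append(lg_mpa)
    draws = []
    for is_red, lg_nyu in unmatched:
        pool = pools.get((is_red, math.floor(lg_nyu / bin_width)))
        draws.append(rng.choice(pool) if pool else None)
    return draws
```

Drawing from the matched population, rather than applying a mean relation, preserves the scatter between the two estimates in the filled-in sample.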

Stellar mass estimation is subject to various theoretical
uncertainties in predicting the $M_*/L$, due to the choice of the SPS
model, dust extinction law, stellar evolution model, and most
importantly, the form of the IMF (see Conroy 2013 for a general
review on these topics and L12 for a detailed discussion of the various
systematic uncertainties in stellar mass estimation). Assuming
no trend in IMF with galaxy type, environment, or redshift,² the
range of different IMFs causes uncertainty in the absolute
normalisation of the $M_*/L$ of factors of two to several depending on the
passband (Bell & de Jong 2001), thus shifting the inferred SMF
horizontally while maintaining the same shape. The second major
source of uncertainty lies in the difficulty in correctly measuring
the total fluxes of individual galaxies from aperture photometry.
The SDSS pipeline returns a fraction of the total light even when
the cmodel magnitudes are used (Abazajian et al. 2009), due to
a combination of over-subtraction of the sky background and poor
model fits to the light profiles (Bernardi et al. 2013). This issue is
particularly acute for massive galaxies, where the measured number
density of galaxies above $3 \times 10^{11}\,h^{-2}M_\odot$ may be
underestimated by as much as a factor of five. Unfortunately, there is no
well-calibrated correction that can be applied to the MPA/JHU
catalogue. Therefore, we use the MPA/JHU catalogue as-is and focus
on the particular mapping between the MPA/JHU stellar masses
and dark matter halos. For the purpose of this mapping, both types
of systematic errors on the stellar mass are rather benign, as long
as the ranking order of galaxies in their stellar mass estimates is
largely unperturbed. Unless otherwise noted, we will refer to the
MPA/JHU stellar mass estimate simply as the “stellar mass” and
denote it with $M_*$ for the rest of the paper.

¹ http://home.strw.leidenuniv.nl/~jarle/SDSS/
² For evidence suggesting non-universal IMFs, however, see van Dokkum
& Conroy (2010), Conroy & van Dokkum (2012), Cappellari et al. (2012),
and Ferreras et al. (2013).


2.3 Stellar Mass Sample Selection 

The flux-limited nature of surveys like the SDSS makes it very
difficult to select volume-limited galaxy samples thresholded or
binned in stellar mass, due to the large spread in $M_*/L$ at fixed
$M_*$. However, the $M_*/L$ distribution of the red, quiescent galaxies
is much narrower than that of the blue, star-forming galaxies (Gallazzi
& Bell 2009), therefore at any given redshift the observed red
population has a sharp cutoff at low stellar mass, where the observed
galaxies are dominated by the blue population near the flux
limit. This phenomenon is best illustrated in Figure 1a, where we
show the average $g-r$ colour at each ($M_*$, $z$) location,
colour-coded by the colourbar on top. The map is well separated into two
regimes based on the mixing of galaxy colours by the narrow streak
of the $(g-r) = 0.8$ population, which we hereafter refer to as the
“mixture” limit, analogous and related to the “flux” limit on a
luminosity vs. $z$ diagram.

In Figure 1a, above the mixture limit the variation of the
average $g-r$ colour with $M_*$ is largely uniform across all
redshifts below 0.1, with a gradual and smooth transition at
$M_* \sim 10^{10}\,h^{-2}M_\odot$ as a result of the galaxy evolution physics (a.k.a.
downsizing; Cowie et al. 1996). The sharp colour transition across
the mixture limit, however, is purely a manifestation of the
flux-limited nature of the sample, combined with the fact that quiescent
galaxies produce much less light than their star-forming counterparts
at the same $M_*$ due to their higher $M_*/L$ ratio. The sharpness
of the transition corresponds to the narrow width of the $M_*/L$
distribution of these quiescent galaxies. Therefore, the mixture limit
provides a simple empirical guideline for selecting stellar mass
samples of relatively high volume-completeness, circumventing the
problem of theoretically determining the maximum $M_*/L$ ratio for
galaxies at the flux limit of each redshift.

Since our goal is to infer the stellar-to-halo mass mapping for
the average galaxy population with a fair mix of both the quiescent
and the star-forming galaxies, which occupy dark matter halos in
different ways, we will restrict our analysis to the galaxies above
the “mixture” limit (see Section 4.1 for an additional factor for
making this choice of restriction). The functional form we adopt
to describe the mixture limit is

$\lg M_*^{\rm mix} = 5.4 \times (z - 0.025)^{0.33} + 8.0$, (1)

i.e., the white curves shown in Figure 1b and Figure 1c, slightly
more conservative than the $(g-r) = 0.8$ streak in Figure 1a. We
then define different stellar mass samples above this mixture limit
for the clustering and lensing measurements in Figure 1b and 1c,
which show the distributions of galaxy number counts and number
densities, respectively, on the $M_*$-$z$ diagram. The thick black
boxes in Figure 1b represent a typical sample selection scheme
adopted by the traditional HOD analysis. In order to predict the
Figure 1. Distributions of the mean $g-r$ rest-frame colour (panel a), the observed number counts (panel b), and the co-moving number density (panel c)
of SDSS DR7 galaxies on the stellar mass vs. redshift plane, with colourbars shown on the top. In the middle panel, the thick “boxes” indicate the galaxy
samples selected for the traditional HOD analysis (cHOD), while the thin “wedges” represent the omitted region where galaxies are the most abundant. The
gray regions in the right panel are the galaxy samples that were used for our fiducial iHOD analysis. The curved lower boundary of the selection is determined
by the mixture limit below which the red galaxy population is severely underrepresented in the spectroscopic survey, and is derived from the distinctive white
stripe ($(g-r) = 0.8$) in the left panel (see Equation 1).


Table 1. The two sets of stellar mass bins used for the iHOD and cHOD analyses, corresponding to the dark and gray thick selections in Figure 1b and 1c,
respectively. The iHOD analysis includes two extra stellar mass samples below $\lg M_* = 9.8$, while the higher stellar mass samples have the same binning in
$\lg M_*$ and the same minimum redshifts $z_{\min}$ for the two analyses. However, all the iHOD samples have more extended redshift ranges, thresholded by the
mixture limit defined in Equation 1 on the far side, so the maximum redshift $z_{\max}$ listed for each sample is the maximum redshift of any galaxies in that
sample. For the $z_{\max}$ and $N_g$ we list the corresponding numbers for the cHOD analysis in parentheses. The average large-scale bias for all, central, and satellite
galaxies, the average logarithmic halo mass corresponding to the central galaxies, and the satellite fraction of each iHOD sample (with 68% uncertainties) are
derived from the fiducial iHOD constraint for the corresponding iHOD samples.
clustering and lensing signals, the HOD model has to assume either
volume-completeness of the sample within those boxes, or an
ad hoc prescription describing the completeness as a function of $M_*$
and/or $z$ (e.g., Miyatake et al. 2013). Both assumptions are less
than ideal due to our ignorance about the $M_*/L$ ratio distribution
of galaxies. Furthermore, the rectangular shape of the selections,
imposed merely for the sake of modelling convenience, inevitably
misses the regions where the galaxies are the most abundant (thin
gray “wedges”) on the 2D histogram because they occupy a much
larger co-moving volume per unit redshift compared to the lower
redshift “boxes”. In principle the traditional HOD analysis can include
more galaxies inside those “wedges” by adopting finer stellar
mass bins with fewer galaxies per sample, but the measurement
signal from each individual sample would be very noisy, rendering
this scheme highly impractical. In this study we develop two
novel improvements over the traditional HOD approach: 1) to
statistically account for the sample incompleteness in a self-consistent
way, and 2) to be able to predict the clustering and lensing signal
for all the galaxies above the mixture limit (thick gray selections in
Figure 1c). We hereafter refer to the traditional HOD approach as
the cHOD and our improved version the iHOD, where c and i are
loosely tied to “completeness” and “incompleteness”, respectively.
Table 1 summarises the basic information of the two sets of sample
selections used by the two modelling methods. In total, we select
314,302 (61% of the bright0 sample) galaxies for the iHOD
analysis, and from them 170,483 (54% of the iHOD galaxies) are
used for the cHOD analysis.
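The mixture-limit selection of Equation 1 can be sketched in a few lines. This is an illustration under stated assumptions: the exponent is taken to be 0.33, the limit is clamped to $\lg M_* = 8.0$ below $z = 0.025$ (the text does not say how lower redshifts are treated), and all function names are hypothetical.

```python
def lg_mstar_mix(z, exponent=0.33):
    """Mixture limit in lg M* at redshift z, following the form of Equation 1.

    Assumption: below z = 0.025 the limit is clamped to its minimum value of 8.0.
    """
    return 5.4 * max(z - 0.025, 0.0) ** exponent + 8.0

def z_max_for_mass(lg_mstar, exponent=0.33):
    """Largest redshift at which a galaxy of lg_mstar (>= 8.0) is above the limit."""
    return 0.025 + ((lg_mstar - 8.0) / 5.4) ** (1.0 / exponent)

def select_above_limit(galaxies):
    """Keep the (lg_mstar, z) pairs that lie on or above the mixture limit."""
    return [(m, z) for m, z in galaxies if m >= lg_mstar_mix(z)]
```

Inverting the limit, as in `z_max_for_mass`, gives the far-side redshift boundary of each stellar mass sample, which is what lets the iHOD samples extend beyond the rectangular cHOD boxes.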


© 0000 RAS, MNRAS 000, 000-000 



































Zu & Mandelbaum 




smaller than expected from assuming the errors are pure Poisson,
due to the strong cosmic variance in these samples. These improvements
in the S/N of the measurements are crucial to the success of
the iHOD model in robustly solving the mapping between galaxies
and halos without resorting to any external information (e.g., the
SMF) or strong priors on the model parameters (e.g., the scatter in
the SHMR).
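To make the comparison with the Poisson expectation concrete, the sketch below computes an average S/N from the diagonal of an error matrix and the gain expected if errors scaled purely as $1/\sqrt{N}$. The function names are illustrative, and applying the total galaxy counts of the two selections (314,302 and 170,483) is only a rough guide, since the gain in each individual stellar mass bin depends on that bin's own counts.

```python
import math

def mean_snr(signal, cov_diag):
    """Average signal-to-noise over bins, using only the diagonal error variances."""
    return sum(abs(s) / math.sqrt(v) for s, v in zip(signal, cov_diag)) / len(signal)

def poisson_snr_gain(n_new, n_old):
    """S/N gain expected if the errors scale as 1/sqrt(N) (pure Poisson or shape noise)."""
    return math.sqrt(n_new / n_old)

# Rough overall expectation from the total sample sizes quoted in Section 2.3.
overall_gain = poisson_snr_gain(314302, 170483)
```

A measured S/N ratio below this kind of estimate is what one would expect when cosmic variance, which does not average down with the number of galaxies in a fixed volume, contributes to the error budget.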

We now describe how we measure the projected correlation
function from galaxy pair counting and the surface density contrast
from g-g lensing; readers who are familiar with the technical
details can skip the remainder of this section.


3.1 Projected Galaxy Correlation Function 

We measure the projected correlation function $w_p$ for each galaxy
sample by integrating the 2D redshift-space correlation function


Figure 2. Comparison between the clustering (left) and lensing (right) measurements
for the iHOD (coloured solid) and cHOD (gray dashed) samples
of the same stellar mass ranges, marked at the beginning of each pair of
curves in the left panel. In each panel, the coloured and hatched bands represent
the sizes of the uncertainties in the corresponding measurements, and
the number marked at the end of each pair of bands is the ratio between the
average signal-to-noise ratios of the iHOD and the cHOD measurements, all
>1 because of the extra galaxies included in the iHOD samples. To avoid
clutter, the units of the projected correlation function (left) and the g-g lensing
signals (right) for each stellar mass bin are scaled arbitrarily.


2.4 Source catalogue 

As sources for the g-g lensing measurement, we use a catalogue 
of background galaxies jReyes et al.||2012l l with a number den¬ 
sity of 1.2 arcmin“^ with weak lensing shears estimated using the 
re-Gaussianization method jHirata & Seljak|2003^ and photomet¬ 
ric redshifts from Zurich Extragalactic Bayesian Redshift Analyzer 
(ZEBRA, |Feldmann et al.|2006^ . The catalogue was characterised 
in several papers that describe the data, and use both the data and 


simulations to estimate systematic errors (see 

Reyes et aL|2012| 

Mandelbaum et al.|2012||Nakajima et al.|2012| 

Mandelbaum et al.| 

2m3|l. 


3 MEASURING GALAXY CLUSTERING AND 
GALAXY-GALAXY LENSING 

Figurej^compares the clustering and g-g lensing signals measured 
for the six pairs of iHOD (solid) and cHOD (dashed) samples that 
share the same stellar mass ranges (marked in the left panel). In 
each panel, the coloured and the hatched bands illustrate the sizes 
of the measurement uncertainties (i.e., the diagonal of the error ma¬ 
trices) in the iHOD and the cHOD cases, respectively. At the end 
of each pair of uncertainty bands, we mark the ratio between the 
average signal-to-noise ratios {S/N) of the iHOD and the cHOD 
measurements. The improvement in the S/N of the g-g lensing 
measurements (right panel) is consistent with the reduction in the 
Poisson errors by factors of yjN/j/Ng, where N^ and Ng are the 
numbers of galaxies in the iHOD and the cHOD samples, respec¬ 
tively (the 4th column of Table [TJ. For the clustering measure¬ 
ments (left panel), the improvement for the three higher mass bins 
is consistent with the expected decrease in the Poisson error by 
Ng/Ng, while for the three lower mass bins the improvement is 



C{rp,r^)dr^, 


( 2 ) 


where $r_p$ and $r_\pi$ are the projected and the line-of-sight (LOS) comoving distances between two galaxies. We measure the $w_p$ signal out to a maximum projected distance of $r_p^{\max} = 20\,h^{-1}$Mpc, where the galaxy bias is approximately linear. For the integration limit, we adopt a maximum LOS distance of $r_\pi^{\max} = 60\,h^{-1}$Mpc, where the average pairwise peculiar velocity is very small (Tinker et al. 2006). Based on a mildly modified linear Kaiser formalism, van den Bosch et al. (2013) showed that the $w_p$ measured with $r_\pi^{\max} = 60\,h^{-1}$Mpc should be boosted by a few to 8%, depending on $r_p$, in order to correct for the residual redshift-space distortion (RRSD) effect due to the small but non-zero peculiar velocities beyond $r_\pi > 60\,h^{-1}$Mpc (see their figure 6). The boost was calibrated for a mock galaxy sample binned in luminosity with a different cosmology, so it is not suitable to apply it directly to our measurement. However, the impact of the RRSD effect on our results, especially on the SHMR, can be roughly estimated as follows: in the most extreme case that all the constraining power comes from the clustering at large scales, based on the average bias vs. halo mass relation for the central galaxies listed in Table 1, we can infer that a 5% boost in $w_p$ translates to 2.5% in galaxy bias (since $w_p \propto b^2$ on large scales), which translates to ~0.015 dex in halo mass, shifting the SHMR toward the lower mass end by 0.015 dex. This shift is much smaller than the scatter in the halo mass at fixed stellar mass, and our constraint does not come solely from $w_p$; therefore the impact of ignoring RRSD should be negligible.

We compute the 2D correlation function using the Landy-Szalay estimator (Landy & Szalay 1993),

$$ \xi^s(r_p, r_\pi) = \frac{DD - 2DR + RR}{RR}, \tag{3} $$

where $DD$, $DR$, and $RR$ represent the number counts of pairs of two data galaxies, one data and one random galaxy, and two random galaxies, respectively. The Landy-Szalay estimator has minimal variance and is insensitive to the number of random points (Kerscher et al. 2000). For each given galaxy, we find all the neighbouring galaxies within a cylinder, centered on that galaxy, whose radius and height are set by $r_p^{\max}$ and $r_\pi^{\max}$, using the libkd package within the Astrometry.net software (Lang et al. 2010). We then count all the pairs in each cell centered on $(r_p, r_\pi)$, and the three sets of pair counts thus directly give the value of $\xi^s(r_p, r_\pi)$ via Equation (3).
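
As a rough illustration of Equations (2) and (3), the sketch below evaluates the Landy-Szalay estimator in a single $(r_p, r_\pi)$ cell and then sums over the line-of-sight bins. It is not the pipeline used for the measurements (which relies on libkd for the pair search). The function names, the normalisation of the raw counts by the total number of possible pairs, and the uniform binning in $|r_\pi|$ are assumptions made for this example.

```python
def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy-Szalay estimate of xi in one (r_p, r_pi) cell from raw pair counts.

    Assumes dd and rr count each unique pair once, and dr counts every
    data-random pair once.
    """
    # Normalise each raw count by the total number of pairs of that kind.
    dd_n = dd / (n_data * (n_data - 1) / 2.0)
    dr_n = dr / (n_data * n_rand)
    rr_n = rr / (n_rand * (n_rand - 1) / 2.0)
    return (dd_n - 2.0 * dr_n + rr_n) / rr_n


def projected_wp(xi_row, d_rpi):
    """Integrate xi(r_p, r_pi) over r_pi from -r_pi_max to +r_pi_max.

    xi_row holds xi in uniform bins of |r_pi| from 0 to r_pi_max (width d_rpi);
    the factor of 2 accounts for the symmetric negative half of the integral.
    """
    return 2.0 * d_rpi * sum(xi_row)


# Toy numbers: a cell whose pair counts match the randoms has xi = 0,
# and doubling DD at fixed DR and RR raises xi to 1.
xi_flat = landy_szalay(dd=49.5, dr=1000.0, rr=4995.0, n_data=100, n_rand=1000)
xi_clustered = landy_szalay(dd=99.0, dr=1000.0, rr=4995.0, n_data=100, n_rand=1000)
wp_example = projected_wp([xi_clustered] * 60, d_rpi=1.0)
```

With 60 bins of width $1\,h^{-1}$Mpc the sum covers the $r_\pi^{\max} = 60\,h^{-1}$Mpc integration limit quoted above.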



























































The error covariance matrix for each $w_p$ measurement is estimated via the jackknife resampling technique (Norberg et al. 2009). We divide the entire footprint into 200 spatially contiguous, roughly equal-size patches on the sky and compute $w_p$ for each of the 200 jackknife subsamples by leaving out one patch at a time. For each stellar mass sample, we adopt the sample mean of the 200 subsample measurements as our final estimate of $w_p$, and the sample covariance matrix as an approximation to the underlying error covariance.
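
The jackknife step can be sketched as follows. This is a minimal example rather than the code used for the measurements: the function name is hypothetical, and the $(N-1)/N$ prefactor is the usual jackknife convention, assumed here because the text does not spell out the normalisation of the covariance.

```python
def jackknife_mean_cov(subsamples):
    """Mean and covariance from leave-one-out jackknife measurements.

    subsamples is a list of N vectors, one per jackknife subsample (one sky
    patch removed), each holding the w_p values in the r_p bins.
    """
    n = len(subsamples)
    nbin = len(subsamples[0])
    mean = [sum(s[i] for s in subsamples) / n for i in range(nbin)]
    # Leave-one-out subsamples share most of their data, so the scatter among
    # them is rescaled by (n - 1) / n (assumed convention, see lead-in).
    factor = (n - 1) / n
    cov = [[factor * sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in subsamples)
            for j in range(nbin)]
           for i in range(nbin)]
    return mean, cov


# Two toy subsamples with two r_p bins each.
toy_mean, toy_cov = jackknife_mean_cov([[1.0, 2.0], [3.0, 2.0]])
```

The square roots of the diagonal of the returned matrix would correspond to the error bars, while the off-diagonal terms carry the bin-to-bin correlations.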

The random galaxy catalogue was constructed in two steps. First, we generate a sample of random positions on the sky with 10 times the size of the data catalogue, using the mangle software (Hamilton & Tegmark 2004; Swanson et al. 2008) and the bright0 angular selection function provided on the NYU-VAGC website. Second, we calculate the 2D joint distribution of stellar mass and redshift from the data catalogue, and draw 5,131,500 random pairs of $(M_*, z)$ values to assign to the stored random positions. In this way, we ensure the random galaxy catalogue has exactly the same angular, radial, and stellar mass joint selection functions as the data galaxies. We apply the same sample selection criteria to the random catalogue and the data sample. As mentioned in Section 2, we only use the $w_p$ values down to the physical distance that corresponds to the fibre radius at the maximum redshift of that sample.
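
One simple way to realise the second step is to resample whole $(M_*, z)$ pairs from the data with replacement, which preserves the correlation between stellar mass and redshift. The sketch below assumes this resampling scheme; the text only states that the pairs are drawn from the 2D joint distribution, so a binned or smoothed draw would be an equally valid reading. The function name is hypothetical.

```python
import random


def assign_mass_redshift(data_pairs, n_random, seed=0):
    """Draw (M*, z) pairs for the random points from the data's joint distribution.

    data_pairs is a list of (lg M*, z) tuples from the data catalogue.
    """
    rng = random.Random(seed)
    # Sampling whole pairs with replacement keeps M* and z correlated
    # exactly as they are in the data.
    return [rng.choice(data_pairs) for _ in range(n_random)]


# Toy catalogue of three galaxies, resampled onto 50 random positions.
toy_pairs = [(10.2, 0.05), (11.0, 0.15), (9.5, 0.03)]
toy_randoms = assign_mass_redshift(toy_pairs, 50, seed=1)
```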

The $w_p$ measurements for the eight iHOD stellar mass samples are shown as the solid circles with errorbars in the top sub-panels in Figure]^. The errorbars reflect the diagonal components of the jackknife covariance matrices. Due to the strong cosmic variance effects in the low stellar mass samples ($\lg M_* < 10.6$), the off-diagonal components (not shown here) are overall strong and persist on both small and large scales, while for the high stellar mass ones ($\lg M_* > 11.0$) they are only prominent on scales larger than $5\,h^{-1}$Mpc and between two adjacent distance bins, i.e., along the diagonal blocks. We will refer back to Figurej^ and discuss in more detail the comparison between the measurements and the predictions from our best-fit model in Sectionj^.


3.2 Surface Density Contrast from Galaxy-Galaxy Lensing


Here we describe how we measure the surface density contrast from g-g lensing. The lensing measurement begins with identification of background source galaxies around each lens (with photometric redshift larger than the lens spectroscopic redshift). Inverse variance weights are assigned to each lens-source pair, including both shape noise and measurement error terms in the variance:

$$ w_{ls} = \frac{\Sigma_{\rm crit}^{-2}}{\sigma_e^2 + \sigma_{\rm SN}^2}, \tag{4} $$

where $\sigma_e$ is the shape measurement error due to pixel noise, and $\sigma_{\rm SN}$ is the RMS intrinsic ellipticity (both quantities are per component, rather than total; the latter is fixed to 0.365 following Reyes et al. 2012). $\Sigma_{\rm crit}$ is the critical surface mass density defined by

$$ \Sigma_{\rm crit} = \frac{c^2}{4\pi G}\,\frac{D_s}{D_l\, D_{ls}\,(1+z_l)^2}, \tag{5} $$

where $D_l$ and $D_s$ are the angular diameter distances to lens and source, and $D_{ls}$ is the distance between them. We use the estimated photometric redshift of each source to compute $D_s$ and $D_{ls}$. The factor of $(1+z_l)^2$ comes from our use of comoving coordinates.

The projected mass density in each radial bin can be computed via a summation over lens-source pairs “ls” and random lens-source pairs “rs”:

$$ \Delta\Sigma(r_p) = \frac{\sum_{ls} w_{ls}\, e_t^{(ls)}\, \Sigma_{\rm crit}(z_l, z_s)}{2\mathcal{R}\sum_{rs} w_{rs}}, \tag{6} $$

where $e_t$ is the tangential ellipticity component of the source galaxy with respect to the lens position, the factor of $2\mathcal{R}$ converts our definition of ellipticity to the tangential shear $\gamma_t$, and $r_p$ is the comoving projected radius from the lens. The division by $\sum_{rs} w_{rs}$ accounts for the fact that some of our ‘sources’ are physically associated with the lens, and therefore not lensed by it (see, e.g., Sheldon et al. 2004). Finally, we subtract off a similar signal measured around random lenses, to remove any coherent systematic shear contributions (Mandelbaum et al. 2005); this signal is statistically consistent with zero for all scales used in this work.

To calculate the error bars, we also use the jackknife resampling method. As shown in Mandelbaum et al. (2005), internal estimators of error bars (in that case, bootstrap rather than jackknife) perform consistently with external estimators of error bars for $\Delta\Sigma$, because it is dominated by shape noise.

Use of photometric redshifts, which have nonzero bias and significant scatter, gives rise to a bias in the signals that can be easily corrected using the method from Nakajima et al. (2012). This bias is a function of lens redshift, and is properly calculated including all weight factors for each lens sample, taking into account its redshift distribution. For typical lens samples in this work, the bias for which we apply a correction is of order 1 per cent, far below the statistical errors; the maximum is slightly below 10 per cent.


4 MODEL FOR MAPPING THE STELLAR CONTENT TO HALOS

4.1 A Tale of Two HODs: cHOD vs. iHOD

The key to statistically solving the mapping between stellar content and dark matter halos is $p(M_*, M_h)$, the 2D joint probability density distribution of a galaxy with stellar mass $M_*$ sitting in a halo of mass $M_h$, normalised so that $\iint p(M_*, M_h)\, dM_*\, dM_h = 1$. For analysing a galaxy sample with stellar mass range $[M_*^0, M_*^1]$ and redshift range $[z_0, z_1]$, it is common practice for HOD models to directly parameterize the occupation number as a function of halo mass for the entire sample, $\langle N_g(M_h)\rangle$, which is related to $p(M_*, M_h)$ via

$$ \langle N_g(M_h)\rangle = \frac{n_g}{z_1 - z_0}\left(\frac{dn}{dM_h}\right)^{-1} \int_{z_0}^{z_1}\!\int_{M_*^0}^{M_*^1} p(M_*, M_h)\, f_{\rm obs}(M_*|M_h, z)\, dM_*\, dz, \tag{7} $$

where $dn/dM_h$ is the halo mass function, $n_g$ is the total galaxy number density in the Universe (both observed and unobserved), and $f_{\rm obs}$ is the detection rate that varies between 0 and 1. At any given redshift, the detection rate can be explicitly written as

$$ f_{\rm obs}(M_*|M_h, z) = \int_0^{\Gamma_{\max}(z)} g(\Gamma|M_*, M_h)\, d\Gamma, \tag{8} $$

where $g(\Gamma|M_*, M_h)$ is the distribution of the stellar mass-to-light ratio $\Gamma = M_*/L$ of galaxies at fixed $M_*$ within halos of mass $M_h$, and $L_{\min}(z)$ is the luminosity threshold corresponding to the flux limit at $z$, so that $\Gamma_{\max}(z) = M_*/L_{\min}(z)$. At low redshift, $L_{\min}(z)$ is low (large $\Gamma_{\max}$), so the integral extends over a wide range of $\Gamma$ values and gives a result that approaches 1, while at higher redshift, $L_{\min}(z)$ is high (small $\Gamma_{\max}$), severely limiting the range of accessible $\Gamma$ values and thus lowering $f_{\rm obs}$.
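
To make the behaviour of the detection rate concrete, the sketch below evaluates the integral of the mass-to-light distribution up to $\Gamma_{\max} = M_*/L_{\min}$ for a purely hypothetical choice of $g(\Gamma)$: a log-normal in $\Gamma$ with mean $\lg\Gamma = 0.3$ and scatter 0.15 dex. These numbers, the function name, and the neglect of any halo mass dependence are assumptions for illustration only; the true $g(\Gamma|M_*, M_h)$ is not known.

```python
import math


def f_obs(mstar, l_min, mu_lg_gamma=0.3, sigma_lg_gamma=0.15):
    """Detection rate for galaxies of stellar mass mstar at a luminosity limit l_min.

    Assumes a log-normal distribution of Gamma = M*/L (hypothetical mean and
    scatter), so the integral from 0 to Gamma_max is a Gaussian CDF in lg Gamma.
    """
    gamma_max = mstar / l_min
    x = (math.log10(gamma_max) - mu_lg_gamma) / (math.sqrt(2.0) * sigma_lg_gamma)
    return 0.5 * (1.0 + math.erf(x))


# A low luminosity threshold (low redshift) gives f_obs close to 1, while a
# high threshold (higher redshift) removes most of the high-Gamma systems.
f_low_z = f_obs(1.0e10, 1.0e8)
f_high_z = f_obs(1.0e10, 1.0e10)
```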

In order to describe the sample using a single HOD, $f_{\rm obs}$ must be uniform across the redshift range, i.e., $f_{\rm obs}(M_*|M_h, z) = f_{\rm obs}(M_*|M_h)$, which happens if and only if $f_{\rm obs}(M_*|M_h) = 1$, so that

$$ \langle N_g(M_h)\rangle = n_g\left(\frac{dn}{dM_h}\right)^{-1} \int_{M_*^0}^{M_*^1} p(M_*, M_h)\, dM_*. \tag{9} $$

Following the standard procedure in HOD modelling, we adopt Equation (9) for the cHOD model, writing separate contributions to $p(M_*, M_h)$ for central and satellite galaxies, and derive constraints by applying it to the samples defined in Figure 1. For the sake of comparison to the iHOD constraint, we do not use the number density of sample galaxies as an input to the cHOD constraints.

However, due to the increase of the luminosity threshold with redshift for a flux-limited sample, the $M_*$-selected galaxy populations are strictly stratified in $z$ and the HODs at different redshifts must be treated separately. At fixed redshift $z$, the HOD is

$$ \langle N_g(M_h|z)\rangle = n_g\left(\frac{dn}{dM_h}\right)^{-1} \int_{M_*^0}^{M_*^1} p(M_*, M_h)\, f_{\rm obs}(M_*|M_h, z)\, dM_*. \tag{10} $$

Therefore, in order to predict $\langle N_g(M_h|z)\rangle$ from $p(M_*, M_h)$ we need to know $f_{\rm obs}(M_*|M_h, z)$, which is inaccessible to us because of our ignorance of $g(\Gamma|M_*, M_h)$. However, we can rewrite the above Equation as


$$ \langle N_g(M_h|z)\rangle = \left(\frac{dn}{dM_h}\right)^{-1} \int_{M_*^0}^{M_*^1} p(M_h|M_*)\left[\Phi(M_*)\, f_{\rm obs}(M_*|M_h, z)\right] dM_*, \tag{11} $$

by using Bayes’ Theorem $p(M_h|M_*) = p(M_*, M_h)/p(M_*)$ and the definition of the parent SMF $\Phi(M_*) = n_g\, p(M_*)$. To make further progress in our predictions, we adopt the ansatz that above the mixture limit, $M_* > M_*^{\rm mix}(z)$, the dependence of $f_{\rm obs}(M_*|M_h, z)$ on the halo mass is very weak, i.e., $f_{\rm obs}(M_*|M_h, z) \simeq f_{\rm obs}(M_*|z)$. Applying this ansatz to Equation (11) we arrive at

$$ \langle N_g(M_h|z)\rangle = \left(\frac{dn}{dM_h}\right)^{-1} \int_{M_*^0}^{M_*^1} p(M_h|M_*)\, \Phi_{\rm obs}(M_*|z)\, dM_*, \tag{12} $$

where $\Phi_{\rm obs}(M_*|z) = \Phi(M_*)\, f_{\rm obs}(M_*|z)$ is the observed SMF at redshift $z$, directly accessible from the survey. For modelling the samples defined in Figure 1 for the iHOD analysis, we measure the observed galaxy SMF at each redshift, and then obtain the HOD for that redshift slice using Equation (12). In this way, we avoid the need to explicitly model the incompleteness as a function of $M_*$ and/or $M_h$. Although it appears from Equation (12) that both the amplitude and the shape of $\Phi_{\rm obs}$ are used to derive $\langle N_g(M_h|z)\rangle$, in essence we only use the shape as an input to the iHOD constraint, because the normalisation of $\langle N_g(M_h|z)\rangle$ is irrelevant to the prediction of the clustering and lensing signals.

Before going any further, here we lay out the theoretical arguments leading to the ansatz, and defer the detailed discussion of its validation using consistency checks to Sections 6 and 7.1. Splitting the galaxies into red and blue populations explicitly in Equation (8), we have

$$ f_{\rm obs}(M_*|M_h, z) = \int_0^{\Gamma_{\max}(z)} \left\{ f_{\rm red}(M_*|M_h)\, g_{\rm red}(\Gamma|M_*, M_h) + \left[1 - f_{\rm red}(M_*|M_h)\right] g_{\rm blue}(\Gamma|M_*, M_h) \right\} d\Gamma, \tag{13} $$

where $f_{\rm red}(M_*|M_h)$ is the intrinsic fraction of red galaxies at given $M_*$ within halos of mass $M_h$. To understand the potential dependence of $f_{\rm obs}(M_*|M_h, z)$ on the halo mass $M_h$, we first examine the variations of $f_{\rm red}(M_*|M_h)$ and $g_{\rm red/blue}(\Gamma|M_*, M_h)$ with $M_h$ separately and then combine them using the above Equation. Since the $M_*/L$ ratio $\Gamma$ is very tightly correlated with galaxy colour $c$ (Bell & de Jong 2001), we can instead look at the colour distributions of the two populations, $g_{\rm red/blue}(c|M_*, M_h)$, each with a centroid position $c$ and a spread $\Delta c$. By analysing the variation in the galaxy colour bimodality with stellar mass and projected neighbour density using SDSS, Baldry et al. (2006) found that $c$ and $\Delta c$ of the red and blue populations are stable across different environments of void, field, and groups and clusters (i.e., halo masses), while the red fraction $f_{\rm red}$ increases continuously with $M_*$ and the local density of the environment, i.e., the so-called “mass” and “environmental” quenching, respectively (Peng et al. 2012). This stability in the colour distribution within each coloured population against the environment implies $g_{\rm blue}(\Gamma|M_*, M_h) = g_{\rm blue}(\Gamma|M_*)$ and $g_{\rm red}(\Gamma|M_*, M_h) = g_{\rm red}(\Gamma|M_*)$, so that Equation (13) can be simplified as

$$ f_{\rm obs}(M_*|M_h, z) = f_{\rm red}(M_*|M_h)\, G_{\rm red}(M_*|z) + \left[1 - f_{\rm red}(M_*|M_h)\right] G_{\rm blue}(M_*|z), \tag{14} $$

where

$$ G_{\rm red}(M_*|z) = \int_0^{\Gamma_{\max}(z)} g_{\rm red}(\Gamma|M_*)\, d\Gamma, \qquad G_{\rm blue}(M_*|z) = \int_0^{\Gamma_{\max}(z)} g_{\rm blue}(\Gamma|M_*)\, d\Gamma \tag{15} $$


are the fractions (from 0 to 1) of the red/blue galaxies with $M_*$ at $z$ that will be observed given the flux limit and the separate red/blue distributions of $M_*/L$ ratios. Above the mixture limit, the completeness of both the red and blue galaxies is relatively high, so that $G_{\rm red}(M_*|z) \approx G_{\rm blue}(M_*|z) = f_{\rm obs}(M_*|z)$ and Equation (14) gives $f_{\rm obs}(M_*|M_h, z) = f_{\rm obs}(M_*|z)$. Below the mixture limit, however, the observed red galaxies are so scarce that $G_{\rm red}(M_*|z) \sim 0$, yielding a halo mass-dependent $f_{\rm obs}(M_*|M_h, z) \sim \left[1 - f_{\rm red}(M_*|M_h)\right] G_{\rm blue}(M_*|z)$, because $f_{\rm red}(M_*|M_h)$ is sensitive to $M_h$ (George et al. 2011).

We expect the ansatz to be largely valid except for the low 
mass samples {Mt<W^^h~^M q) where the colour bimodality 


might shift with halo mass because of galaxy evolution (Taylor et al. 2015). We nonetheless proceed to apply this ansatz to the entire stellar mass range, relying on the fact that the statistical power of the low mass samples is quite low, and discuss the possible impact on our results later in Section 7.1. Finally, we emphasise that this ansatz, $f_{\rm obs}(M_*|M_h, z) \simeq f_{\rm obs}(M_*|z)$, explicitly made in the iHOD model to account for the fact that for any given $M_*$ we observe fewer intrinsically high-$M_*/L$ systems at higher redshifts, is a much weaker assumption than that required by the traditional HOD models, which assume all the high-$M_*/L$ systems were detected in the sample, i.e., $f_{\rm obs}(M_*|M_h, z) = 1$. In other words, since $f_{\rm obs}(M_*|M_h, z) = 1$ is a sufficient condition for $f_{\rm obs}(M_*|M_h, z) = f_{\rm obs}(M_*|z)$, the iHOD model includes the cHOD model as a subset.
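
The logic of the ansatz can be checked with a few numbers. The sketch below evaluates the red/blue mixture form of the detection rate for two halo environments that differ only in their red fraction; all input values are invented for illustration and the function name is hypothetical.

```python
def f_obs_mixture(f_red, g_red, g_blue):
    """Detection rate as a red/blue mixture of the per-colour completeness."""
    return f_red * g_red + (1.0 - f_red) * g_blue


# Above the mixture limit: red and blue completeness are nearly equal, so a
# change in the red fraction (i.e., in halo mass) leaves f_obs unchanged.
above_low_fred = f_obs_mixture(f_red=0.2, g_red=0.9, g_blue=0.9)
above_high_fred = f_obs_mixture(f_red=0.8, g_red=0.9, g_blue=0.9)

# Below the mixture limit: red galaxies are essentially unobserved, so f_obs
# inherits the halo mass dependence of the red fraction.
below_low_fred = f_obs_mixture(f_red=0.2, g_red=0.0, g_blue=0.9)
below_high_fred = f_obs_mixture(f_red=0.8, g_red=0.0, g_blue=0.9)
```

In the first pair the two environments give the same detection rate, while in the second pair they differ by a factor of four, which is the regime where the ansatz would break down.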

It is important to point out that during the constraint, the iHOD model remains entirely agnostic of the overall amplitude of the SMF, whether it be the parent SMF (normalised by $n_g$) or the observed ones (normalised by the product of $n_g$ and $f_{\rm obs}(M_*|z)$), because neither $n_g$ nor $f_{\rm obs}(M_*|z)$ is known a priori. However, the iHOD model is built on top of the halo mass function, which has a fixed normalisation in any given ΛCDM cosmology. Thus, once constrained by the clustering and lensing data, the best-fit iHOD model gives an explicit prediction for both the shape and the amplitude of the parent SMF. This prediction automatically gives an estimate of $f_{\rm obs}(M_*|z)$ when compared to the observed SMF at redshift $z$. The estimated $f_{\rm obs}(M_*|z)$ thus provides a useful consistency check of the iHOD model. Any significant departure of $f_{\rm obs}$ from unity at stellar mass above the mixture limit would indicate failure of our model assumptions (e.g., the ansatz about the weak dependence of $f_{\rm obs}(M_*|z)$ on $M_h$) and/or degeneracies in the model parameters (e.g., residual degeneracy between scatter and amplitude of the SHMR), while a successful model, as we will discover in Section [TJ], would reproduce $f_{\rm obs}$ curves that are slightly below unity. Another consistency check is the comparison between the SHMR constraints from the cHOD and the iHOD models on the high mass end: in the regime of $M_* \gg M_*^{\rm mix}(z)$ where $f_{\rm obs}(M_*|M_h, z) \rightarrow 1$, the iHOD model becomes identical to the cHOD model and the two sets of constraints, as will be shown later in Section 6, should be consistent with each other.

In summary, both the cHOD and iHOD models rely on the prediction of $p(M_*, M_h)$. The primary difference is that the cHOD model employs a single HOD by assuming that the galaxy sample is volume-complete in stellar mass, while the iHOD model is able to self-consistently take into account the redshift-dependent selection function of the samples by working in narrow redshift slices and using information from the shape of the observed galaxy SMF at each slice. This improvement in iHOD enables us to include 84% more galaxies than used in cHOD by adding the low mass galaxies in the local universe as well as more distant galaxies, hence the significant improvement in the $S/N$ of the measurements shown in Figure 2. This process cannot be extended indefinitely, since at lower stellar mass (below the mixture limit for a given redshift) the basic assumption behind iHOD fails, but the range of stellar masses where it is applicable is still large enough for significant improvements.


4.2 Deriving the Two HODs 

Here we describe the theoretical framework for predicting the HODs for both the cHOD and iHOD models. In particular, we first model the total number of galaxies (both observed and unobserved) per log-stellar mass within halos of some fixed mass, $dN(M_*|M_h)/d\lg M_*$, and then compute the joint probability as

$$ p(M_*, M_h) = \frac{\lg e}{M_*\, n_g}\, \frac{dN(M_*|M_h)}{d\lg M_*}\, \frac{dn}{dM_h}. \tag{16} $$

To conform to the traditional HOD notation, we hereafter refer to $dN(M_*|M_h)/d\lg M_*$ simply as $\langle N(M_*|M_h)\rangle$ by implicitly assuming $\Delta\lg M_* = 0.02$ throughout the paper. Given $p(M_*, M_h)$, we can use Equation (9) to specify a single HOD for each of the cHOD samples. For the iHOD analysis, however, we also need to compute the parent stellar mass function

$$ \Phi(M_*) = n_g \int_0^{+\infty} p(M_*, M_h)\, dM_h, \tag{17} $$

and extract the observed stellar mass function $\Phi_{\rm obs}(M_*)$ directly from the data. The distribution of host halo mass for galaxies at fixed stellar mass is simply

$$ p(M_h|M_*) = \frac{n_g\, p(M_*, M_h)}{\Phi(M_*)}, \tag{18} $$

and we can obtain the HODs for individual redshift slices within each iHOD sample using Equation (12). Since the redshift range (0.02-0.30) spans ~3 billion years during which only 8% of the total stellar mass observed today formed (cf. equation 15 in Madau & Dickinson 2014), we assume $\langle N(M_*|M_h)\rangle$ to be constant with redshift. Therefore, all the redshift evolution in the theoretical model comes from the cosmic growth in the halo mass function. To speed up the calculation without loss of accuracy, we adopt the same halo mass function for all the redshift slices within each sample, calculated at the volume-averaged redshift of that sample. We have tested this approximation by comparing the predicted signals with those computed from integrating the halo mass functions over all the redshift slices, and the difference is negligible.
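
The chain from the 2D occupation to the HOD of one redshift slice can be sketched on a coarse grid as follows. This is a minimal illustration under stated assumptions, not the implementation used for the fits: the function names, the trapezoidal integration, and the tabulated inputs are hypothetical, and the quantity carried around is the product $n_g\, p(M_*, M_h)$, so that the unknown $n_g$ never has to be specified.

```python
import math

LG_E = math.log10(math.e)


def trapz(y, x):
    """Trapezoidal integral of samples y on the grid x."""
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))


def joint_number_density(n2d, hmf, mstar_grid):
    """n_g * p(M*, M_h): (lg e / M*) * dN/dlgM* * dn/dM_h on the grid.

    n2d[i][j] is dN/dlgM* at (M*_i, M_h_j); hmf[j] is dn/dM_h at M_h_j.
    """
    return [[LG_E / mstar_grid[i] * n2d[i][j] * hmf[j] for j in range(len(hmf))]
            for i in range(len(mstar_grid))]


def hod_in_slice(joint, hmf, mstar_grid, mh_grid, phi_obs):
    """<N_g(M_h|z)> for one redshift slice from the observed SMF phi_obs.

    phi_obs[i] is the observed SMF (per unit M*) at M*_i in that slice.
    The parent SMF must be non-zero at every grid point.
    """
    # Parent SMF: integrate n_g * p(M*, M_h) over halo mass.
    phi = [trapz(row, mh_grid) for row in joint]
    hod = []
    for j in range(len(mh_grid)):
        # p(M_h|M*) * Phi_obs = n_g * p(M*, M_h) * Phi_obs / Phi.
        integrand = [joint[i][j] * phi_obs[i] / phi[i] for i in range(len(mstar_grid))]
        hod.append(trapz(integrand, mstar_grid) / hmf[j])
    return hod


# Toy grid: when the observed SMF equals the parent SMF (a complete sample),
# the slice HOD reduces to the integral of n_g * p over M*, divided by dn/dM_h.
ms_grid = [1.0, 2.0, 3.0]
mh_grid = [1.0, 2.0]
toy_hmf = [2.0, 1.0]
toy_joint = [[toy_hmf[j] for j in range(2)] for _ in ms_grid]
toy_phi = [trapz(row, mh_grid) for row in toy_joint]
hod_complete = hod_in_slice(toy_joint, toy_hmf, ms_grid, mh_grid, toy_phi)
hod_half = hod_in_slice(toy_joint, toy_hmf, ms_grid, mh_grid, [0.5 * p for p in toy_phi])
```

Halving the observed SMF at all masses halves the HOD without changing its shape, which is why only the shape of the observed SMF matters for the clustering and lensing predictions.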


4.3 Parameterizing $\langle N(M_*|M_h)\rangle$

Our analytic model for $\langle N(M_*|M_h)\rangle$ has two components: 1) the mean and the scatter of the SHMR for the central galaxies, the combination of which automatically specifies $\langle N_{\rm cen}(M_*|M_h)\rangle$, and 2) the mean number of satellite galaxies with stellar mass $M_*$ inside halos of mass $M_h$, $\langle N_{\rm sat}(M_*|M_h)\rangle$. We adopt a similar parameterization for the two components as in L11, but allow more of the parameters to vary during the constraint, and predict the signals differently. Here we briefly describe the functional forms and the model parameterizations.

At fixed halo mass, we assume a log-normal probability distribution for the stellar mass of the central galaxies, with a log-normal scatter. The mean SHMR is then the sliding mean of the log-normal distribution as a function of the halo mass, $f_{\rm SHMR} \equiv \exp\langle\ln M_*(M_h)\rangle$. The L11 functional form for $f_{\rm SHMR}$ that we adopt for our analysis is defined by Behroozi et al. (2010) via its inverse function,

$$ M_h = M_1\, m^{\beta}\, 10^{\left(\frac{m^{\delta}}{1 + m^{-\gamma}} - \frac{1}{2}\right)}, \tag{19} $$

where $m \equiv M_*/M_{*,0}$. Among the five parameters that describe $f_{\rm SHMR}$, $M_1$ and $M_{*,0}$ are the characteristic halo mass and stellar mass that separate the behaviours in the low and high mass ends ($f_{\rm SHMR}(M_1) = M_{*,0}$). The inverse function starts with a low-mass end slope $\beta$, crosses a transitional regime around ($M_{*,0}$, $M_1$) dictated by $\gamma$, and reaches a high-mass end slope $\beta + \delta$. Figure 1 in L11 illustrates the different responses of $f_{\rm SHMR}$ to the changes in each of the five parameters.
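
Because the mean SHMR is defined only through its inverse, evaluating $f_{\rm SHMR}(M_h)$ requires a numerical inversion. The sketch below writes the inverse relation in the logarithmic form of Behroozi et al. (2010) and inverts it by bisection. The function names, the bisection bracket, and the parameter values are illustrative assumptions; in particular the numbers are not the best-fit values of this work.

```python
import math


def lg_halo_mass(lg_mstar, lg_m1, lg_ms0, beta, delta, gamma):
    """Inverse SHMR: lg M_h at a given lg M* (Behroozi et al. 2010 form)."""
    m = 10.0 ** (lg_mstar - lg_ms0)
    return lg_m1 + beta * math.log10(m) + m ** delta / (1.0 + m ** (-gamma)) - 0.5


def lg_f_shmr(lg_mh, params, lo=6.0, hi=13.0):
    """lg of the mean SHMR at a given lg M_h, by bisection on the inverse relation.

    Relies on lg M_h increasing monotonically with lg M*, which holds for
    positive beta, delta and gamma; lg_mh must lie within the bracketed range.
    """
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lg_halo_mass(mid, *params) < lg_mh:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative parameters only: (lg M_1, lg M_*0, beta, delta, gamma).
shmr_params = (12.1, 10.3, 0.33, 0.42, 1.2)
lg_mstar_at_m1 = lg_f_shmr(12.1, shmr_params)
lg_mstar_cluster = lg_f_shmr(14.0, shmr_params)
```

By construction the relation passes through the characteristic point, so the stellar mass returned at $M_h = M_1$ equals $M_{*,0}$.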

Figure 3. Left Panel: The 2D HOD, i.e., the average number of galaxies per dex in stellar mass within halos at fixed mass, predicted by the best-fit iHOD model. Right Panel: The probability distributions of the host halo mass for galaxies at three fixed log-stellar masses listed on the top right. Thick solid, thin solid, and thin dashed lines represent contributions from all, central, and satellite galaxies, respectively. Galaxies of higher stellar mass are more likely to be central galaxies sitting in high mass halos, whereas low mass galaxies are as likely to be satellites of high mass halos as centrals of their own halos.

The log-normal scatter at fixed halo mass is the quadratic sum of the intrinsic scatter and the measurement error. It is important to keep in mind that the intrinsic part of this scatter must be the same for all studies, while studies with different datasets or stellar mass determination methods may have differing measurement error contributions and thus a different total log-normal scatter. L12 considered two models for the scatter, one that is constant and another that includes an empirical stellar mass-dependence in the measurement error across the entire mass range. They found that since the constraint on the overall scatter is primarily driven by the high mass end where the slope of $f_{\rm SHMR}$ is much shallower, the mass-dependence of the scatter at the low mass end has little impact on their results. In light of this finding and to focus on the scatter at the high mass end, we keep the scatter independent of halo mass below $M_1$, but allow more freedom in the scatter above the characteristic mass scale, with an extra component that is linear in $\lg M_h$:

$$ \sigma_{\ln M_*}(M_h) = \begin{cases} \sigma_{\ln M_*}, & M_h < M_1 \\ \sigma_{\ln M_*} + \eta\, \lg\left(M_h/M_1\right), & M_h \geq M_1 \end{cases} \tag{20} $$

The motivation behind this additional degree of freedom is two-fold: 1) the average measurement uncertainty of the stellar mass estimates decreases with $M_*$ (hence $M_h$) in the MPA/JHU catalogue, and 2) recently, there is evidence suggesting a smaller scatter at the high mass end of the SHMR (e.g., Shankar et al. 2014), although in Equation (20) the scatter is allowed to either increase or decrease with halo mass. The combination of the mean and the scatter fully specifies the HOD of central galaxies.
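
A minimal sketch of the two ingredients, the halo mass dependent scatter and the log-normal distribution of central stellar mass, is given below. The function names and the numerical values are assumptions for illustration, and the density is expressed per unit $\ln M_*$.

```python
import math


def sigma_ln_mstar(lg_mh, lg_m1, sigma0, eta):
    """Scatter in ln M* at fixed halo mass: constant below M_1, linear in lg M_h above."""
    if lg_mh < lg_m1:
        return sigma0
    return sigma0 + eta * (lg_mh - lg_m1)


def n_cen(ln_mstar, ln_f_shmr, sigma):
    """Log-normal density of central galaxies per unit ln M* at fixed halo mass."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((ln_mstar - ln_f_shmr) / sigma) ** 2) / norm


# Illustrative values: scatter of 0.5 below lg M_1 = 12, declining with eta = -0.04.
sigma_group = sigma_ln_mstar(11.5, 12.0, 0.5, -0.04)
sigma_cluster = sigma_ln_mstar(14.0, 12.0, 0.5, -0.04)

# Each halo hosts exactly one central, so the density integrates to unity.
step = 0.001
n_total = sum(n_cen(-5.0 + step * k, 0.0, sigma_cluster) * step for k in range(10001))
```

A negative $\eta$ makes the distribution narrower in massive halos, while the normalisation to one central per halo is unaffected.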


$$ \langle N_{\rm cen}(M_*|M_h)\rangle = \frac{1}{\sigma_{\ln M_*}(M_h)\sqrt{2\pi}} \exp\left[-\frac{\left(\ln M_* - \ln f_{\rm SHMR}(M_h)\right)^2}{2\sigma^2_{\ln M_*}(M_h)}\right]. \tag{21} $$

Following L11, we model $\langle N_{\rm sat}(M_*|M_h)\rangle$ as the derivative of the satellite occupation number in stellar mass-thresholded samples, $\langle N_{\rm sat}(>M_*|M_h)\rangle$, which is parameterized as a power of halo mass with an exponential cutoff and scaled to $\langle N_{\rm cen}(>M_*|M_h)\rangle$ as follows,

$$ \langle N_{\rm sat}(>M_*|M_h)\rangle = \langle N_{\rm cen}(>M_*|M_h)\rangle \left(\frac{M_h}{M_{\rm sat}}\right)^{\alpha_{\rm sat}} \exp\left(-\frac{M_{\rm cut}}{M_h}\right). \tag{22} $$

Instead of fixing $\alpha_{\rm sat}$ to be 1 as in L11, we allow it to vary during the fit. We parameterize both the characteristic mass of a single satellite-hosting halo, $M_{\rm sat}$, and the cutoff mass scale, $M_{\rm cut}$, as simple power law functions of the threshold stellar mass, so that

$$ \frac{M_{\rm sat}}{10^{12}\,h^{-2}M_\odot} = B_{\rm sat}\left(\frac{f^{-1}_{\rm SHMR}(M_*)}{10^{12}\,h^{-1}M_\odot}\right)^{\beta_{\rm sat}} \tag{23} $$

and

$$ \frac{M_{\rm cut}}{10^{12}\,h^{-2}M_\odot} = B_{\rm cut}\left(\frac{f^{-1}_{\rm SHMR}(M_*)}{10^{12}\,h^{-1}M_\odot}\right)^{\beta_{\rm cut}}, \tag{24} $$

respectively. In practice, we choose a 0.02 dex bin size in stellar mass for the numerical differentiation of $\langle N_{\rm sat}(>M_*|M_h)\rangle$.

The left panel of Figure 3 displays the $\langle N(M_*|M_h)\rangle$ map predicted by the best-fit iHOD model in our analysis. The SHMR of the central galaxies is clearly seen as the “main sequence” enveloping the “cloud” occupied by the satellite galaxies. The right panel shows the probability distribution of host halo masses for galaxies at three fixed stellar masses (blue, green, and red; the values are listed on the top right of the panel), each computed from the map in the left panel using the combination of Equations (16), (17), and (18). The thin solid and the dashed curves of each colour show the contributions to the total probability distribution from the central and the satellite galaxies, respectively. The central galaxy contribution shows an increasing logarithmic scatter in halo mass with stellar mass, mainly due to the change of slope in the SHMR when going to the high mass end (where $\sigma_{\ln M_*}$ shrinks slightly as well, i.e., $\eta < 0$). The lower mass satellites, however, populate halos of much greater diversity than their high mass counterparts. Meanwhile, the satellite fraction increases toward lower stellar masses, but the total number of galaxies is dominated by central galaxies at all stellar masses.


4.4 Spatial Distribution of Galaxies within Halos


In addition to the parameterization of $\langle N(M_*|M_h)\rangle$, we also need to model the spatial distribution of galaxies within dark matter halos for the small-scale clustering and lensing predictions. We assume the isotropic Navarro-Frenk-White (NFW; Navarro et al. 1997) density profile for halos with a concentration-mass relation $c_{\rm dm}(M_h)$ calibrated from simulations (described further below). Although the NFW density profile is regarded as “universal” only in pure dark matter simulations where the effects of baryons are absent, recent studies of nearby rich clusters found that, despite the fact that the baryons tend to flatten the dark matter distribution in the cluster centre, the total matter density (i.e., baryon and dark matter combined) still maintains an NFW shape from the scales of the central brightest cluster galaxy (BCG) out to the virial radius (Newman et al. 2013), probably due to the significant mixing between stars and dark matter as a result of frequent mergers (Laporte & White 2014). For smaller systems like the group and galaxy-scale halos with Mh<W^*h~^ Mq, however, the sum of an NFW and a stellar mass component is required to explain the inner slope of the observed total matter density profiles (Mandelbaum et al. 2006; Newman et al. 2015). In light of these observational findings, we add a model-independent stellar mass component as a point source in the halo centre to the g-g lensing predictions for all the galaxy samples, so that

$$ \Delta\Sigma_{\rm stellar}(r_p) = \frac{\langle M_*\rangle}{\pi r_p^2}, \tag{25} $$

where $\langle M_*\rangle$ is the average stellar mass of each stellar mass sample. Although this extra stellar component is not necessary for the central galaxies of very rich clusters, i.e., some galaxies within the highest stellar mass sample, the minimum fitting scale for that sample is above $0.1\,h^{-1}$Mpc (see Appendix A), where the stellar contribution to $\Delta\Sigma$ calculated via Equation (25) is negligible.
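
The size of the point-source term is easy to estimate. The sketch below evaluates the lensing signal of a point mass, $M/(\pi r_p^2)$, for an invented mean stellar mass; the function name and the numbers are assumptions, and factors of $h$ and the comoving versus physical distinction are ignored for simplicity.

```python
import math


def delta_sigma_stellar(mean_mstar, r_p):
    """Lensing signal of a central point mass: <M*> / (pi r_p^2).

    With mean_mstar in Msun and r_p in pc, the result is in Msun / pc^2.
    """
    return mean_mstar / (math.pi * r_p ** 2)


# A 1e11 Msun point mass at 0.1 Mpc (1e5 pc) and at 1 Mpc (1e6 pc).
ds_inner = delta_sigma_stellar(1.0e11, 1.0e5)
ds_outer = delta_sigma_stellar(1.0e11, 1.0e6)
```

The $r_p^{-2}$ scaling means the term drops by a factor of 100 per decade in radius, so it matters only on the smallest fitted scales.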

Central galaxies are placed at the centres of the NFW halos. For the modelling of $w_p$, we do not consider the miscentering effect, which is likely to be important only for around 30% of the BCGs (George et al. 2012). In our analysis the galaxy clustering for BCGs is measured down to ~$0.2\,h^{-1}$Mpc, and is therefore largely immune to the miscentering effect, which has a kernel of ~$75\,h^{-1}$kpc (George et al. 2012). It is worth noting that the miscentering issue in our case is more benign than in other recent papers that have explicitly modelled it (e.g., Miyatake et al. 2013; More et al. 2014). In those papers, the samples that were being modelled had strict colour and luminosity selection, such that the central galaxy in the halo might not be present in the sample. In our case, with a flux-limited sample that goes to a relatively low flux limit, the central galaxies in the vast majority of group- and cluster-size halos should be present in the sample, so we only have to contend with small offsets of the central galaxy from the halo centre (rather than a complete misidentification of the central galaxy). For the g-g lensing, the miscentering effect is compounded and somewhat cancelled by the contribution from the subhalos of satellite galaxies (Yoo et al. 2006), which is more important for low stellar mass samples. We will discuss the modelling of the combined effects further in Section lYSl.

We assume an NFW profile for the satellite distribution as well, but with a different amplitude of the concentration-mass relation than the dark matter. In particular, we set $c_{\rm sat}(M_h) = f_c \times c_{\rm dm}(M_h)$, where $f_c$ characterises the spatial distribution of satellite galaxies relative to the dark matter within halos.
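
The effect of $f_c$ on the satellite distribution can be illustrated with the enclosed-number profile of an NFW halo. In the sketch below the function names and the concentration value are assumptions chosen for illustration; the concentration-mass relation itself is not modelled.

```python
import math


def nfw_mu(x):
    """Dimensionless enclosed mass of an NFW profile: ln(1 + x) - x / (1 + x)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def satellite_fraction_within(r_over_rvir, c_dm, f_c):
    """Fraction of a halo's satellites inside r, for concentration c_sat = f_c * c_dm."""
    c_sat = f_c * c_dm
    return nfw_mu(c_sat * r_over_rvir) / nfw_mu(c_sat)


# Satellites that trace the dark matter (f_c = 1) versus a more extended
# satellite distribution (f_c = 0.5), both inside half the virial radius.
frac_tracing = satellite_fraction_within(0.5, c_dm=5.0, f_c=1.0)
frac_extended = satellite_fraction_within(0.5, c_dm=5.0, f_c=0.5)
```

Values of $f_c$ below unity move satellites toward the halo outskirts, which lowers the small-scale clustering and the satellite contribution to the lensing signal at small radii.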

To summarise, we have in total 13 model parameters. Among them, $\{\lg M_1, \lg M_{*,0}, \beta, \delta, \gamma\}$ describe the mean SHMR, $\{B_{\rm sat}, \beta_{\rm sat}, B_{\rm cut}, \beta_{\rm cut}, \alpha_{\rm sat}\}$ describe the parent HOD of satellite galaxies, $\{\sigma_{\ln M_*}, \eta\}$ describe the logarithmic scatter about the mean SHMR, and $f_c$ is the ratio between the concentrations of the satellite distribution and the dark matter profile.

Footnote: However, see Gao et al. (2008) and Dutton & Macciò (2014) regarding potentially better “universality” when using the Einasto (1965) profile.


5 PREDICTING SIGNALS FROM THE HALO MODEL 


5.1 Prerequisites and Approximations 

The goal of our analysis is to infer the SHMR of central galaxies and the HOD of satellite galaxies using the galaxy clustering and the g-g lensing. From the inferred $\langle N(M_*|M_h)\rangle$, we can predict the parent stellar mass function and compare it to those empirically reconstructed from the $V_{\max}$ method. This route is exactly the reverse of the methodology employed in the SHAM studies (e.g., M10), which infer the SHMR by abundance matching to the $V_{\max}$-estimated SMF and then predict the clustering and/or lensing as a cross-check.

To facilitate a direct comparison with the SHAM results, we adopt the same flat $\Lambda$CDM cosmology as in M10 (listed in Section 1). For the linear matter power spectrum, we use the low-baryon transfer function of Eisenstein & Hu (1998), which is a good approximation to the full transfer function on scales well below the BAO scale. For the non-linear matter correlation function $\xi_{mm}$, we generate the non-linear power spectrum with the prescription of Takahashi et al. (2012), an updated version of the halofit prescription from Smith et al. (2003), and Fourier transform it. The halo mass function and the halo bias function are from Tinker et al. (2008) and Tinker et al. (2010), respectively. The halo mass-concentration relation $c_{\rm dm}(M_h)$ is from the fitting formula of Zhao et al. (2009), which accurately recovers the flattening of halo concentration at high masses. (In DR7 we do not expect an upturn, which only shows up at redshifts beyond 1; see Prada et al. 2012.)

For the cHOD samples the signal prediction is relatively straightforward, as only one HOD is required to describe each sample. We describe the prediction of $w_p$ and $\Delta\Sigma$ from a single HOD in Section 5.2. Complexity arises when predicting the signals for the iHOD samples, each of which contains multiple HODs that describe the stratification of galaxy populations due to the redshift-dependent selection function. We divide each iHOD sample into multiple redshift slices of $\Delta z = 0.01$, which corresponds to a comoving width of $\sim 30\,h^{-1}$Mpc across the redshift range of the MGS. This redshift bin size is chosen to make sure the slices are 1) thin enough for the redshift selection function to be uniform, and 2) thick enough to include both the one- and two-halo terms in the LOS direction internally. Within each redshift slice $i$ we predict the clustering and lensing signals, $w_p^i$ and $\Delta\Sigma^i$, in the same way as in the cHOD analysis, and combine them to obtain the final predictions for the full sample,




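$$ w_p = \frac{\sum_i N_i^2\, w_p^i}{\sum_i N_i^2} \tag{26} $$

and

$$ \Delta\Sigma = \frac{\sum_i N_i W_i\, \Delta\Sigma^i}{\sum_i N_i W_i}, \tag{27} $$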


where $N_i$ is the number of observed galaxies in slice $i$ and $W_i$ is the per-galaxy lensing weight at each redshift. $W_i$ is essentially related to the inverse variance weights $w_{ls}$ defined earlier, but integrated over the source redshift distribution. It also includes the geometric factor ($1/\Sigma_c^2$) as well as the fact that an annular bin with a fixed central value of $r_p$ and width $\Delta r_p$ contains more source galaxies at lower redshift, due to its larger area on the sky. Since the source catalogue is the same for the g-g lensing in each redshift slice, the pair counting weight for $\Delta\Sigma^i$ is $\propto N_i$ instead of $\propto N_i^2$.

Figure 4. Contributions from individual redshift slices to the predicted clustering (left) and lensing (right) signals of the iHOD sample with $\lg M_* = [10.2, 10.6]$. In each panel, the top sub-panel compares the measured signal (circles with errorbars) to the predicted total signal, decomposed into the contributions from each of the 11 redshift slices (colour-coded by the colourbar on the right). The dashed curve in each stack marks the maximum redshift (0.09) of the cHOD sample with the same stellar mass range, indicating a $\sim$40% gain in the total signal by switching from cHOD to iHOD. Due to the difference in the weighting schemes, the fractional contributions of the same redshift slice to the total signals are different in the two cases. The bottom sub-panels show the ratio of the signal predicted for each slice over the signal for the whole sample. In both the clustering and lensing cases, the fractional variation is coherent with redshift and below 25% on all scales.

Figure 4 illustrates the fractional contributions of each individual redshift slice to the predicted total signal (calculated from Equations 26 and 27; top sub-panels), and compares the fractional variations among the predicted signals of all redshift slices (bottom sub-panels), using the iHOD sample with $\lg M_* = [10.2, 10.6]$ as an example. The dashed curve in each top sub-panel indicates the maximum redshift (0.09) of the corresponding cHOD sample with the same stellar mass range. Clearly, the three extra redshift slices above $z = 0.09$ included by the iHOD analysis contribute $\sim$40% of the total signal from the 11 slices, because of the much higher weights associated with those higher redshift slices.
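The slice combination of Equations (26) and (27) can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the function name, array layout, and all numerical inputs are hypothetical and are not the SDSS values; the weights are taken to be $N_i^2$ for clustering and $N_i W_i$ for lensing, as described in the text.

```python
import numpy as np

def combine_slices(wp_slices, ds_slices, n_gal, lens_weight):
    """Weighted combination of per-slice w_p and DeltaSigma predictions."""
    wp_slices = np.asarray(wp_slices, dtype=float)      # shape (n_slice, n_rp)
    ds_slices = np.asarray(ds_slices, dtype=float)      # shape (n_slice, n_rp)
    n_gal = np.asarray(n_gal, dtype=float)              # N_i per slice
    lens_weight = np.asarray(lens_weight, dtype=float)  # W_i per slice
    w_clu = n_gal**2              # clustering pair weight, proportional to N_i^2
    w_len = n_gal * lens_weight   # lensing pair weight, proportional to N_i W_i
    wp = (w_clu[:, None] * wp_slices).sum(axis=0) / w_clu.sum()
    ds = (w_len[:, None] * ds_slices).sum(axis=0) / w_len.sum()
    return wp, ds

# Toy inputs (hypothetical): three slices, two radial bins.
n_gal = [100.0, 200.0, 300.0]
lens_weight = [1.0, 0.8, 0.5]
wp_slices = [[10.0, 1.0], [11.0, 1.1], [12.0, 1.2]]
ds_slices = [[5.0, 0.5], [5.0, 0.5], [5.0, 0.5]]
wp, ds = combine_slices(wp_slices, ds_slices, n_gal, lens_weight)
```

Because the clustering weight grows as $N_i^2$, the most populated slices dominate the combined $w_p$, which is the behaviour discussed for the higher redshift slices.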

Since we have used a line-of-sight integration limit of $60\,h^{-1}$Mpc (roughly twice the slice width) to obtain $w_p$, the measured $w_p$ signal will include galaxy pairs that straddle different slices. This effect is not directly reflected in Equation (26), so the cross-correlation between different slices might require a separate treatment. However, because the galaxy correlation functions vary very slowly and smoothly between adjacent slices (a few per cent on large scales; see the bottom sub-panels of Figure 4), the cross terms are most sensitive to the product of the numbers of galaxies in each pair of slices, which is correctly accounted for in Equation (26).

In particular, for two slices $i$ and $j$, Equation (26) makes the assumption that the 3D cross-correlation is approximately $(N_i^2\xi_i + N_j^2\xi_j)/(2N_iN_j)$, while the more correct form should be $\sqrt{\xi_i\xi_j}$, as the cross-correlation is close to the geometric mean of the two auto-correlation functions $\xi_i$ and $\xi_j$ (Zehavi et al. 2011). When $|\xi_i - \xi_j|$ is small, the fractional difference between the two forms is of the order $(N_j - N_i)^2/(N_iN_j) = (\Delta N/N)^2$. Moreover, after integrating $\xi$ along the line of sight to obtain $w_p$, the $w_p$ signal is always dominated by the values at $r_\pi < 30\,h^{-1}$Mpc at any fixed $r_p$. We also construct a mock galaxy sample with a similar redshift distribution and bias evolution to the stellar mass samples in the data, and compare the projected correlation function obtained from our approximation in Equation (26) to that measured directly from pair counting. We find that the error induced by adopting Equation (26) is no more than 2% on all scales and is thus negligible in our analysis.
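A quick numerical check makes the size of this approximation concrete. The sketch below assumes that the cross term implied by Equation (26) has the form $(N_i^2\xi_i + N_j^2\xi_j)/(2N_iN_j)$ and compares it with the geometric mean $\sqrt{\xi_i\xi_j}$; the galaxy counts and correlation amplitude are hypothetical toy values.

```python
import numpy as np

def cross_term_forms(n_i, n_j, xi_i, xi_j):
    """Return the implied cross term and the geometric-mean cross term."""
    implied = (n_i**2 * xi_i + n_j**2 * xi_j) / (2.0 * n_i * n_j)
    geometric = np.sqrt(xi_i * xi_j)
    return implied, geometric

# Toy case (hypothetical): equal clustering, 10% difference in counts.
n_i, n_j = 1000.0, 1100.0
xi = 2.0
implied, geometric = cross_term_forms(n_i, n_j, xi, xi)
frac_diff = implied / geometric - 1.0
# For equal xi the difference reduces to (N_j - N_i)^2 / (2 N_i N_j).
expected = (n_j - n_i)**2 / (2.0 * n_i * n_j)
```

For a 10% difference in galaxy counts between the two slices the fractional difference is below half a per cent, i.e. of order $(\Delta N/N)^2$.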

It is important to note that the theoretical predictions for $w_p$ and $\Delta\Sigma$ via Equations (26) and (27) are independent of the total number of observed galaxies in each sample, i.e., the normalisation of the observed stellar mass function. If we analysed only half of the galaxies, randomly drawn from the entire bright0 sample, the prediction from the same set of model parameters would not change, and the constraints on the model parameters would stay the same, albeit with larger uncertainties.


5.2 Predicting $w_p$ and $\Delta\Sigma$ from a single HOD

The analytic model for deriving the galaxy clustering and the g-g lensing signals from a single HOD is based on the prescriptions given in a series of papers, including Tinker et al. (2005), Zheng & Weinberg (2007), and Yoo et al. (2006), with improved treatment of the scale-dependent bias, the halo exclusion effect, and over/under-concentrated satellite distributions relative to earlier works (e.g., Berlind & Weinberg 2002; Guzik & Seljak 2002; Mandelbaum et al. 2006). We describe the prescription briefly below and refer interested readers to the aforementioned three papers for details.

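The signals of $w_p$ and $\Delta\Sigma$ are obtained by projecting the 3D real-space galaxy auto-correlation function $\xi_{gg}$ and the galaxy-matter cross-correlation function $\xi_{gm}$, respectively. The projection of $\xi_{gg}$ to $w_p$ is the line-of-sight integral introduced earlier, while for the g-g lensing it is via

$$ \Delta\Sigma(r_p) = \langle\Sigma(< r_p)\rangle - \Sigma(r_p), \tag{28} $$

where

$$ \Sigma(r_p) = \bar{\rho}_m \int_{-\infty}^{+\infty} \left[1 + \xi_{gm}(r_p, r_\pi)\right] dr_\pi, \tag{29} $$

and

$$ \langle\Sigma(< r_p)\rangle = \frac{2}{r_p^2} \int_0^{r_p} r_p'\,\Sigma(r_p')\,dr_p'. \tag{30} $$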
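As an illustration of Equations (28) and (30), the following sketch evaluates $\Delta\Sigma$ numerically from a given surface density profile. It is a minimal example, not the implementation used in the analysis: the function name, the inner integration limit `r_min`, and the power-law test profile are all assumptions made here. For $\Sigma \propto r_p^{-1}$ the mean surface density inside $r_p$ is exactly $2\Sigma(r_p)$, so $\Delta\Sigma = \Sigma$, which gives a simple sanity check.

```python
import numpy as np

def delta_sigma(rp, sigma_of_r, r_min=1e-4, n_grid=2000):
    """DeltaSigma(rp) = <Sigma(<rp)> - Sigma(rp) for a callable Sigma(r).

    The contribution from radii below r_min is neglected (an assumption
    of this sketch), so r_min should be much smaller than any rp.
    """
    rp = np.atleast_1d(np.asarray(rp, dtype=float))
    out = np.empty_like(rp)
    for k, r in enumerate(rp):
        grid = np.logspace(np.log10(r_min), np.log10(r), n_grid)
        integrand = grid * sigma_of_r(grid)
        # Trapezoidal rule for the integral of r' Sigma(r') dr'.
        integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
        mean_inside = 2.0 * integral / r**2
        out[k] = mean_inside - sigma_of_r(r)
    return out

def sigma_power_law(r):
    """Toy surface density profile, Sigma = 1 / r (arbitrary units)."""
    return 1.0 / r

rp = np.array([0.1, 1.0, 10.0])
ds = delta_sigma(rp, sigma_power_law)
```

The small residual with respect to the analytic answer comes entirely from the neglected region inside `r_min`.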


We are ignoring the effects from the radial window, which is broad enough that it is not relevant at galaxy scales (Baldauf et al. 2010). The formalisms for predicting $\xi_{gm}$ and $\xi_{gg}$ are very similar, as the satellite distribution is merely a discrete realisation of NFW tracers with a different concentration than the dark matter. There is, however, one extra contribution to $\xi_{gm}$ and $\Delta\Sigma$ at small scales from the matter retained by the subhalos that host the satellite galaxies (besides the model-independent stellar mass component; see Equation 25). We will discuss this “subhalo” lensing term in more detail in Section 5.3 and focus on the similar terms shared by $\xi_{gg}$ and $\xi_{gm}$ here.

Let us consider a general scenario in which the correlation is between a primary galaxy catalogue and a secondary catalogue consisting of tracer particles $t$, whether it be the dark matter ($\xi_{gm}$) or the same galaxies as the primaries ($\xi_{gg}$). The correlation signal between the primary and the secondary can be decomposed into two components,

$$ \xi_{gt}(r) + 1 = \left[\xi_{gt}^{1h}(r) + 1\right] + \left[\xi_{gt}^{2h}(r) + 1\right], \tag{31} $$

where “1h” and “2h” refer to the so-called “one-halo” and “two-halo” terms, respectively, and the $+1$ following each $\xi_{gt}$ term is to relate the correlation piece to the corresponding number counts. The “1h” term can be theoretically derived via the simple Peebles & Hauser (1974) estimator,

$$ \xi_{gt}^{1h}(r) + 1 = \frac{DD^{1h}(r)}{RR(r)}. \tag{32} $$

The $RR(r)$ term is the expected number of pairs consisting of randomly-distributed galaxies and $t$ particles separated by a distance between $r$ and $r + dr$,

$$ RR(r) = g\,(4\pi r^2 dr)\,\bar{n}_g\,\bar{\rho}_t\,V, \tag{33} $$

where $\bar{n}_g$ and $\bar{\rho}_t$ are the mean densities of sample galaxies and $t$ particles, respectively, and $V$ is the survey volume. The prefactor $g$ is 1 if $t$ is a different species than the primary galaxies ($\xi_{gm}$, with $\bar{\rho}_t = \bar{\rho}_m$), and 1/2 if the two are identical ($\xi_{gg}$, with $\bar{\rho}_t = \bar{n}_g$). The $DD(r)$ term requires separate treatment of central and satellite galaxies within each halo first, and then integration over the halo mass function,

$$ DD^{1h}(r) = V\,dr \int \frac{dn}{dM_h} \left[ DD_{{\rm cen},t}(M_h)\,F'_{{\rm cen},t}(r|c_t) + DD_{{\rm sat},t}(M_h)\,F'_{{\rm sat},t}(r|c_g, c_t) \right] dM_h, \tag{34} $$

where $DD_{{\rm cen},t}$ and $DD_{{\rm sat},t}$ are the total numbers of the cen-$t$ and sat-$t$ type of pairs expected within a halo of mass $M_h$, respectively, and $F_{{\rm cen},t}(r)$ or $F_{{\rm sat},t}(r)$ is the cumulative probability distribution of the numbers of each of these two pair types within that halo (although Equation 34 requires their respective differential forms, described further below). For $\xi_{gg}$,

$$ DD_{{\rm cen},g}(M_h) = \langle N_{\rm cen}(M_h)\,N_{\rm sat}(M_h)\rangle, \tag{35} $$

and

$$ DD_{{\rm sat},g}(M_h) = \langle N_{\rm sat}(M_h)\left(N_{\rm sat}(M_h) - 1\right)\rangle / 2, \tag{36} $$

while $F'_{{\rm cen},g}(r)$ and $F'_{{\rm sat},g}(r)$ are the galaxy NFW profile of concentration $c_g(M_h)$ and the convolution of that galaxy NFW profile with itself, respectively. Similarly for $\xi_{gm}$,

$$ DD_{{\rm cen},m}(M_h) = \langle N_{\rm cen}(M_h)\rangle\,M_h, \tag{37} $$

and

$$ DD_{{\rm sat},m}(M_h) = \langle N_{\rm sat}(M_h)\rangle\,M_h, \tag{38} $$

while $F'_{{\rm cen},m}(r)$ and $F'_{{\rm sat},m}(r)$ are the dark matter NFW profile of concentration $c_{\rm dm}(M_h)$ and the convolution of that dark matter profile with the galaxy NFW profile of concentration $c_g(M_h)$, respectively. We use the analytic formula in Zheng & Weinberg (2007) for the convolution of two NFW profiles, either with the same or different concentration parameters. All the $F(r)$ functions are normalised so that $F(r = 2r_{200m})$ is unity, i.e., we assume both the satellite galaxies and dark matter are contained within the virial radii of their host halos. Recently, the study by van Daalen & Schaye (2015) suggested that it is important to account for the matter outside halos when estimating the small scale matter power spectrum using the halo model. However, this missing power problem does not exist for predicting galaxy clustering and g-g lensing, as the (sub)halos occupied by galaxies are generally massive enough to be fully accounted for in the calculations.
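To make the pair-count factors of Equations (35) to (38) concrete, the sketch below evaluates them for a single halo mass. Two simplifications are assumed here and are not statements from the text: satellite numbers are taken to be Poisson distributed, so that $\langle N_{\rm sat}(N_{\rm sat}-1)\rangle = \langle N_{\rm sat}\rangle^2$, and central and satellite occupations are treated as uncorrelated. The function name and the input values are hypothetical.

```python
def one_halo_pair_counts(n_cen, n_sat, m_halo):
    """Pair-count factors for one halo mass, given mean occupations.

    Assumes Poisson satellites and uncorrelated central/satellite
    occupations (simplifications made for this sketch only).
    """
    return {
        "cen_g": n_cen * n_sat,    # central-satellite galaxy pairs
        "sat_g": 0.5 * n_sat**2,   # satellite-satellite galaxy pairs
        "cen_m": n_cen * m_halo,   # central-matter pairs (mass-weighted)
        "sat_m": n_sat * m_halo,   # satellite-matter pairs (mass-weighted)
    }

# Toy halo (hypothetical): one central, two satellites on average.
pairs = one_halo_pair_counts(n_cen=1.0, n_sat=2.0, m_halo=1.0e13)
```

In a full calculation these factors would multiply the corresponding differential pair profiles and be integrated over the halo mass function, as in Equation (34).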

The “two-halo” terms are relatively straightforward to calculate above the halo exclusion scales ($r_{\rm ex} \sim 3\,h^{-1}$Mpc, described further below), via

$$ \xi_{gg}^{2h}(r) = b_g^2\,\zeta^2(r)\,\xi_{mm}(r), \tag{39} $$

and

$$ \xi_{gm}^{2h}(r) = b_g\,\zeta(r)\,r_{cc}(r)\,\xi_{mm}(r), \tag{40} $$

where $b_g$ is the galaxy linear bias, $\zeta(r)$ characterises the fractional scale-dependence in the large scale halo bias, and $r_{cc}(r)$ is the cross-correlation coefficient between the galaxies and the dark matter on relevant scales, defined by

$$ r_{cc}(r) = \frac{\xi_{gm}(r)}{\sqrt{\xi_{gg}(r)\,\xi_{mm}(r)}}. \tag{41} $$

To a good approximation $r_{cc}(r)$ is close to unity on large scales (Guzik & Seljak 2001; Weinberg et al. 2004; Baldauf et al. 2010), and so we set $r_{cc}(r) = 1$ and use the empirical fitting function of $\zeta(r)$ from Tinker et al. (2005). We compute $b_g$ as the galaxy occupation number-weighted halo bias,

$$ b_g = \bar{n}_g^{-1} \int b(M_h)\,\langle N(M_h)\rangle\,\frac{dn}{dM_h}\,dM_h, \tag{42} $$

where $b(M_h)$ is the halo bias function and $\langle N(M_h)\rangle$ is the mean galaxy occupation number of the sample (see the sixth column of Table 1 for the average linear bias of each iHOD sample).
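The occupation-weighted bias of Equation (42) is a ratio of two integrals over the halo mass function, which the sketch below evaluates with a simple trapezoidal rule. All inputs are toy functions chosen for illustration only (a power-law mass function with an exponential cutoff, a logistic occupation function, and a power-law bias); none of them is a fitting formula used in the analysis.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def galaxy_bias(m_h, dn_dm, n_occ, halo_bias):
    """Occupation-weighted halo bias: int b <N> dn/dM dM / int <N> dn/dM dM."""
    n_g = trapezoid(n_occ * dn_dm, m_h)   # mean galaxy number density
    return trapezoid(halo_bias * n_occ * dn_dm, m_h) / n_g

# Toy ingredients (hypothetical shapes, arbitrary normalisation).
m_h = np.logspace(11.0, 15.0, 400)
dn_dm = m_h**-1.9 * np.exp(-m_h / 1.0e14)
n_occ = 1.0 / (1.0 + np.exp(-(np.log10(m_h) - 12.0) / 0.2))
b_toy = 0.5 + (m_h / 1.0e13)**0.5

b_g = galaxy_bias(m_h, dn_dm, n_occ, b_toy)
b_const = galaxy_bias(m_h, dn_dm, n_occ, np.full_like(m_h, 2.0))
```

With a constant halo bias the weighting drops out and the galaxy bias equals that constant, which is a convenient check of the normalisation by $\bar{n}_g$.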

Within the halo exclusion regime (Cacciato et al. 2009), however, Equations (39) and (40) are no longer valid because the centre of one halo cannot sit within the virial radius of another halo, i.e., $r_{\rm ex} < 2 \times r_{200m}^{\max}$, where $r_{200m}^{\max}$ is the maximum halo radius in the calculation. Therefore, the “two-halo” terms at a given distance require explicit integration over pairs of halos that are too small to run into the centre of each other when separated by that distance. We follow the prescription described in Tinker et al. (2005) for the treatment of halo exclusion and adopt the method in Yoo et al. (2006) to circumvent the issue of unsatisfied integral constraints in the halo mass function and halo bias function (see Equation 16 of Yoo et al. 2006 for details).


5.3 Subhalo Contribution to $\Delta\Sigma$ in MassiveBlack-II


The extra “subhalo” lensing term, as mentioned in Section 5.2, arises because the satellite galaxies sit at local density peaks (i.e., their subhalos), rather than at random positions within their main halos. From investigating a suite of N-body cosmological simulations, Mandelbaum et al. (2005) found that the fractional contribution of the subhalo term on scales below $0.1\,h^{-1}$Mpc is roughly equal to the satellite fraction within the sample, but is cut off on slightly larger scales because of the tidal truncation of those subhalos inside larger halos. They found that truncating the subhalos at $\sim$0.4 times the virial radii gives a good match to the lensing signal measured from the simulations. A subsequent exploration by Yoo et al. (2006), however, using a suite of relatively small Smoothed Particle Hydrodynamic (SPH) simulations, estimated that the error induced by ignoring the subhalo contribution is below 10% on the relevant scales. It is still controversial whether the modelling of the subhalo lensing is necessary. While some studies (Velander et al. 2014; Hudson et al. 2015) adopted the “tidally stripped” subhalo lensing model proposed in Mandelbaum et al. (2005), other studies (e.g., Leauthaud et al. 2012; Coupon et al. 2015) ignored the lensing from subhalos, based on the result of Yoo et al. (2006) and/or the fact that the measured total lensing signal is much less constraining than other probes used in their papers (SMF and galaxy clustering).

We revisit the subhalo lensing issue by employing the MassiveBlack-II SPH simulation (MB-II; Khandai et al. 2014), a P-GADGET cosmological hydrodynamic simulation evolved with a total of $2 \times 1792^3$ dark matter and gas particles (mass resolution $\sim$ several times $10^6\,h^{-1}M_\odot$) inside a cubic volume of $100^3\,h^{-3}{\rm Mpc}^3$. MB-II is one of the highest resolution simulations of this size that includes a self-consistent model for star formation, black hole accretion and associated feedback. Thanks to its exquisite capability of resolving low-mass subhalos with realistic baryonic effects (for the details on the HOD of simulated galaxies in MB-II, see Tucker et al., in prep), we quantify the impact of the often-ignored subhalo lensing term by comparing the g-g lensing signals measured from the MB-II galaxies with subhalos and from a sample of mock galaxies without. Following Yoo et al. (2006), in order to construct a subhalo-less mock galaxy sample, we identify the satellite galaxies in each main halo, and randomise their position angles relative to the halo centre while keeping their halo-centric distances fixed. In this way, we ensure that the only difference between the lensing signals measured from the MB-II galaxies and the mock galaxies on small scales is induced by the presence vs. absence of dark matter within the subhalos.

Footnote: http://mbii.phys.cmu.edu/
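The randomisation step described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run on MB-II: the function name and toy coordinates are hypothetical, new directions are drawn isotropically by normalising Gaussian random vectors, and periodic boundary wrapping is ignored.

```python
import numpy as np

def randomise_position_angles(sat_pos, halo_centre, rng):
    """Redraw satellite directions isotropically, keeping radii fixed."""
    centre = np.asarray(halo_centre, dtype=float)
    offsets = np.asarray(sat_pos, dtype=float) - centre
    radii = np.linalg.norm(offsets, axis=1)          # halo-centric distances
    directions = rng.normal(size=offsets.shape)      # isotropic after normalising
    directions /= np.linalg.norm(directions, axis=1)[:, None]
    return centre + radii[:, None] * directions

# Toy halo (hypothetical): 100 satellites around a centre at (50, 50, 50).
rng = np.random.default_rng(42)
halo_centre = np.array([50.0, 50.0, 50.0])
sat_pos = halo_centre + rng.normal(scale=0.3, size=(100, 3))
mock_pos = randomise_position_angles(sat_pos, halo_centre, rng)
```

The mock satellites keep the radial number density profile of the originals but no longer sit on their subhalo density peaks, which is what isolates the subhalo lensing term.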

Figure 5 shows the results of this experiment in six different stellar mass bins at $z = 0.06$, with $M_*$ increasing from left to right and top to bottom. In each top sub-panel, the satellite fraction and the stellar mass range are marked on the top. The open circles with errorbars indicate the measurement of the total g-g lensing signal for that stellar mass bin, decomposed into the central (red solid) and satellite (blue solid) terms. The errorbars are derived from jackknife resampling of the simulation volume. The cyan dashed line shows the satellite term measured from the mock galaxies, which decreases rapidly on small scales due to the lack of subhalos. The magenta dashed line is a crude estimate of the subhalo lensing term, obtained by re-scaling the amplitude of the central term (red solid) and truncating it at $0.4\,r_{200m}$. The amplitude is re-scaled by $f_{\rm sat}/(1 - f_{\rm sat})$, assuming the subhalos that host satellites share the same inner matter density profile as those that host central galaxies of the same stellar mass, analogous to the ansatz employed in the SHAM technique. The black solid line is the sum of this estimated subhalo term, the directly measured central term, and the satellite term from the mock sample, serving as a rough estimate of the predicted total signal with the subhalo lensing contribution, whereas the black dashed line is the sum of the central and mock satellite terms without any subhalo contribution. The difference between the open circles and the black dashed lines represents the magnitude of the error on g-g lensing due to ignoring the subhalo lensing piece, and the difference between the open circles and the black solid lines represents the deficiency of the overly-simplified “re-scaled central” subhalo lensing model in over-predicting or under-predicting the signal at various scales.

The effects of ignoring the subhalo lensing term are better illustrated by the bottom sub-panels of Figure 5, where the black solid and dashed lines shown above are divided by the measured total signals. The gray shaded region indicates the typical uncertainties on the ratios propagated from the measurement uncertainties on $\Delta\Sigma$. The ratio plots clearly demonstrate that ignoring the subhalo lensing term causes a 15%-35% systematic under-prediction of the total signal on scales below $\sim 0.2\,h^{-1}$Mpc, and the deviation is proportional to the fraction of satellite galaxies in the sample. This contradicts the result from Yoo et al. (2006), who found the effect to be generally below 10% at all radii. The opposite conclusions drawn from the two sets of simulations could be due to the drastic difference in resolution and size (MB-II has 2000 times more particles in 8 times the volume of the simulation used by Yoo et al. 2006), and/or the average mass of the subhalos used in the two experiments (i.e., Yoo et al. 2006 could only resolve relatively massive subhalos, hence a much lower satellite fraction). Once the simple “re-scaled central” subhalo lensing term is added, the predicted signal agrees with the direct measurement within the uncertainties for the three high stellar mass bins in the bottom row, but over-predicts the total signal by 15%-35% on the relevant scales for the three low stellar mass bins in the top row. This discrepancy between the simple remedy and the simulation in the low mass samples is caused by the enhanced tidal attenuation effect on the low mass subhalos from their host halos, in addition to the usual tidal truncation effect seen for subhalos of all masses. This attenuation effect is consistent with the findings in Li et al. (2014), who measured the subhalo lensing signal using the satellite lens galaxies selected from the SDSS group catalogue constructed from a redshift-space Friends-of-Friends algorithm (Yang et al. 2007). They found a significant $\Delta\Sigma$ signal below a projected radius of $\sim 0.1\,h^{-1}$Mpc for satellite lenses located 0.1-0.5$\,h^{-1}$Mpc away from their respective group centres, and the amplitude of the subhalo lensing signal can be explained by a truncated and attenuated version of those NFW halos that host central galaxies of the same stellar masses.

Figure 5. The g-g lensing signal of six different stellar mass samples in the MB-II simulation at $z = 0.06$, highlighting the importance of including the subhalo contribution in the modelling of the surface density contrast $\Delta\Sigma$. In each upper sub-panel, open circles are the total $\Delta\Sigma$ signal measured directly from the simulation, which is the sum of the red and the blue solid curves, indicating the contributions from the centrals and the satellites, respectively. The cyan dashed curve represents the satellite contribution measured after randomising the position angles of the simulated satellites (distances to the halo centre kept fixed). The magenta dashed curve accounts for the subhalo contribution by re-scaling the central contribution according to the satellite fraction listed on the top left and truncating the 3D density profile at $0.4\,r_{200m}$. The black solid curve is the sum of the subhalo (magenta), the central (red), and the randomised satellite (cyan) curves, and the black dashed curve the sum of only the latter two. The ratio of the black solid and the black dashed curves to the open circles is shown on each lower sub-panel. Ignoring the subhalo contribution results in an underestimate of the total signal by 15%-35% on small scales, roughly proportional to the satellite fraction in each sample. For the three low mass samples, using the re-scaled and truncated central curve as a rough estimate for the subhalo contribution overestimates the total signal by 10%-35% on small scales, and by $\sim$15% in the transitional regime, albeit within the measurement uncertainties indicated by the gray band. For the high mass samples the black curves agree with the direct measurements within the uncertainties across all scales.

Drawing from these findings in the MB-II simulation and the SDSS groups, we model the subhalo contribution to the galaxy-matter cross-correlation function, $\xi_{gm}^{\rm sub}$, as an attenuated and truncated version of the central term $\xi_{gm}^{\rm cen}$,

$$ \xi_{gm}^{\rm sub}(r|M_*) = \begin{cases} f_t\,\dfrac{f_{\rm sat}}{1 - f_{\rm sat}}\,\xi_{gm}^{\rm cen}(r|M_*), & r < r_t \\ 0, & r \geq r_t, \end{cases} \tag{43} $$

where $f_t$ and $r_t$ are the attenuation factor and the truncation radius, respectively. We adopt a truncation radius of $r_t = 0.4\,r_{200m}(M_*)$, according to the findings in Mandelbaum et al. (2006) and Figure 5. The value of $f_t$ should be stellar mass dependent: taken alone, the results in Figure 5 suggest $f_t$ values of 0.5 and 1.0 for stellar mass samples below and above $\lg M_* = 10$, respectively.
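The attenuated and truncated subhalo term is simple enough to write down directly. The sketch below assumes the piecewise form described in the text: the central term scaled by $f_t\,f_{\rm sat}/(1 - f_{\rm sat})$ inside the truncation radius and zero outside. The function name and the toy profile values are hypothetical.

```python
import numpy as np

def xi_subhalo(r, xi_cen, f_sat, r_trunc, f_t=0.5):
    """Subhalo term: attenuated, re-scaled central term truncated at r_trunc."""
    r = np.asarray(r, dtype=float)
    xi_cen = np.asarray(xi_cen, dtype=float)
    amplitude = f_t * f_sat / (1.0 - f_sat)
    return np.where(r < r_trunc, amplitude * xi_cen, 0.0)

# Toy numbers (hypothetical): r_200m = 0.5, so r_t = 0.4 * 0.5 = 0.2.
r = np.array([0.05, 0.1, 0.3])
xi_cen = np.array([100.0, 40.0, 5.0])
xi_sub = xi_subhalo(r, xi_cen, f_sat=0.2, r_trunc=0.4 * 0.5)
```

For a satellite fraction of 20% and $f_t = 0.5$ the subhalo term is one eighth of the central term inside the truncation radius.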


However, as mentioned in Section 4, the small-scale lensing modelling is further complicated by the miscentering effect, which is absent from Figure 5. As pointed out in Yoo et al. (2006), the miscentering effect smooths the overall $\Delta\Sigma$ signal on small scales with a kernel of length $\sim 0.1\,h^{-1}$Mpc, and is likely to have a large impact for the higher stellar mass bins, i.e., within groups and clusters. The smoothing will effectively reduce the predicted signal for the central galaxies at the 10%-15% level on scales below $0.1\,h^{-1}$Mpc, where the subhalo lensing effect operates in the opposite direction. Since the satellite fraction in the high mass bins is $\sim$20%, this reduction can be effectively absorbed by a low $f_t$ value of $\sim$0.5 in Equation (43) for the high mass bins. Therefore, to simultaneously capture both the effects of subhalo lensing at low $M_*$ and miscentering at high $M_*$ on scales below $0.1\,h^{-1}$Mpc, we adopt a constant attenuation factor of $f_t = 0.5$ in our analysis, regardless of the stellar mass of the sample. We have tried different values of $f_t = 0.5 \pm 0.2$ and found that our results are robust to those changes, due to the relatively large statistical uncertainties of the g-g lensing measurements on small scales (see Figure 6).

For each redshift slice within the iHOD samples and for each individual cHOD sample, the predictions for the $w_p$ and $\Delta\Sigma$ signals are obtained by projecting $\xi_{gg}$ and $\xi_{gm}$ to 2D, according to the $w_p$ projection integral and Equation (28), respectively. We adopt the same line-of-sight integration limit of $60\,h^{-1}$Mpc for the theoretical predictions of $w_p$ as for the measurements. Finally, for each iHOD sample we combine the predictions from all its redshift slices to obtain the predictions for that entire sample via Equations (26) and (27).

Figure 6 compares the predictions from our best-fit model (thick lines) to the measurements from data for the eight iHOD stellar mass samples. The galaxy clustering and g-g lensing results are shown in each pair of the top and bottom sub-panels, respectively. Each best-fit curve is decomposed into the two-halo (thin dot-dashed), the one-halo central (thin dashed), and the one-halo satellite (thin solid) contributions. For the best-fit g-g lensing signal of each sample, we also show the contributions from the subhalo lensing term and the “point source” stellar mass as the thin dotted and dot-dot-dashed lines, respectively. The subhalo lensing term begins to dominate the one-halo satellite contribution at around half the virial radius of the average main halo corresponding to that sample, and then the stellar mass term takes over at the galactic scales below tens of $h^{-1}$kpc. We will come back to Figure 6 for the detailed comparison between the model fits and the measurements in Section 6.

6 PARAMETER CONSTRAINTS 

6.1 Likelihood Model and Bayesian Inference 

Armed with the capability of predicting the galaxy clustering and 
g-g lensing signals for any given cHOD or iHOD sample, we can 
infer the posterior probability distributions of the model parameters 
from the measurements within a Bayesian framework, assuming 
a Gaussian likelihood model and a set of uninformative priors on 
those parameters. 

The observable vector in our likelihood model has two components:

1. $w_p^i(r_p^j)$: the $w_p$ profile of stellar mass sample $i$ measured at projected radius $r_p^j$ (from a sample-dependent minimum scale to $20.0\,h^{-1}$Mpc), for $i \in \{1 \cdots 8\}$ (iHOD) or $i \in \{1 \cdots 6\}$ (cHOD), and $j \in \{1 \cdots n\}$, where $n$ is the number of data points used for each stellar mass sample. Ranked in ascending order of average stellar mass, the samples have $n \in \{17, 16, 15, 14, 14, 13, 13, 10\}$ (iHOD) or $n \in \{16, 15, 14, 14, 13, 10\}$ (cHOD), due to the different fibre-collision induced cutoffs on small scales. There are 112 (82) data points in total for the iHOD (cHOD) analysis.

2. $\Delta\Sigma^i(r_p^k)$: the $\Delta\Sigma$ profile of stellar mass sample $i$ measured at radius $r_p^k$, ranging from a small-scale lensing cutoff to $20.0\,h^{-1}$Mpc, where the cutoff is caused by the systematic uncertainties in estimating the “boost factor” for bright samples (see the Appendix for more details). We use the same set of stellar mass samples as for the clustering in each analysis. The minimum fitting scale is $25\,h^{-1}$kpc for the samples below $\lg M_* = 10.6$, but increases as the sample gets brighter due to the larger boost-factor uncertainty, reaching $0.12\,h^{-1}$Mpc for the highest stellar mass bin. Therefore, $k \in \{1 \cdots n'\}$, where $n' \in \{18, 18, 18, 18, 17, 17, 16, 15\}$ (iHOD) and $n' \in \{18, 18, 17, 17, 16, 15\}$ (cHOD). There are 137 (101) $\Delta\Sigma$ data points in total for the iHOD (cHOD) analysis.


We model the combined vector $\mathbf{x}$ of the $w_p$ and the $\Delta\Sigma$ components as a multivariate Gaussian ($N$=256 variables in total for iHOD and $N$=190 for cHOD), which is fully specified by its mean vector ($\bar{\mathbf{x}}$) and covariance matrix ($\mathbf{C}$). The Gaussian likelihood is thus

$$ \mathcal{L}(\mathbf{x}|\boldsymbol{\theta}) = |\mathbf{C}|^{-1/2} \exp\left( -\frac{(\mathbf{x} - \bar{\mathbf{x}})^T \mathbf{C}^{-1} (\mathbf{x} - \bar{\mathbf{x}})}{2} \right), \tag{44} $$

where

$$ \boldsymbol{\theta} \equiv \{\lg M_h^1, \lg M_*^0, \beta, \delta, \gamma, \sigma_{\ln M_*}, \eta, B_{\rm sat}, \beta_{\rm sat}, B_{\rm cut}, \beta_{\rm cut}, \alpha_{\rm sat}, f_c\}. \tag{45} $$

We adopt flat priors on the model parameters, with a uniform distribution over a broad interval that covers the entire possible range of each parameter (see the 3rd column of the table of parameter priors and constraints). The final covariance matrix $\mathbf{C}$ is assembled by aligning the error matrices of $w_p$ and $\Delta\Sigma$ measured for individual samples along the diagonal blocks of the full $N \times N$ matrix. We ignore the weak covariance between $w_p$ and $\Delta\Sigma$ (which is weak because $\Delta\Sigma$ is dominated by shape noise), and between any two measurements of the same type but for different stellar mass samples.
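The block-diagonal covariance and the Gaussian likelihood can be sketched as below. This is an illustrative example only: the helper names are hypothetical, the toy blocks are small diagonal matrices rather than measured error matrices, and the log-likelihood is written up to the parameter-independent constant, i.e. $\ln\mathcal{L} = -\frac{1}{2}\left[\ln|\mathbf{C}| + (\mathbf{x} - \bar{\mathbf{x}})^T \mathbf{C}^{-1} (\mathbf{x} - \bar{\mathbf{x}})\right]$.

```python
import numpy as np

def block_diagonal(blocks):
    """Place per-sample error matrices along the diagonal of one matrix."""
    n = sum(b.shape[0] for b in blocks)
    cov = np.zeros((n, n))
    start = 0
    for b in blocks:
        m = b.shape[0]
        cov[start:start + m, start:start + m] = b
        start += m
    return cov

def log_likelihood(x, x_model, cov):
    """Gaussian log-likelihood up to an additive constant."""
    resid = np.asarray(x, dtype=float) - np.asarray(x_model, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    chi2 = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (logdet + chi2)

# Toy data (hypothetical): one 2-point w_p block and one 3-point lensing block.
cov = block_diagonal([np.diag([1.0, 4.0]), np.diag([1.0, 1.0, 9.0])])
x_obs = np.array([1.0, 2.0, 0.0, 1.0, 3.0])
x_model = np.zeros(5)
lnl = log_likelihood(x_obs, x_model, cov)
```

Using `slogdet` and `solve` avoids forming the explicit inverse and determinant, which is a common choice for numerical stability when the matrix has a few hundred rows.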

In the parameter inference stage, the posterior distribution is derived using the Markov Chain Monte Carlo (MCMC) algorithm emcee (Foreman-Mackey et al. 2013), where an affine-invariant ensemble sampler is utilised to fully explore the 13-D parameter space. For each MCMC chain of the iHOD and cHOD inferences, we perform 90,000 iterations, 30,000 of which belong to the burn-in period for adaptively tuning the steps. To eliminate the tiny amount of residual correlation between adjacent iterations, we further thin the chain by a factor of 10 to obtain our final results.
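The burn-in removal and thinning amount to simple array slicing. The sketch below assumes a chain stored as an array of shape (n_steps, n_walkers, n_dim); this layout, the function name, and the number of walkers are assumptions made for illustration, and the array is filled with zeros rather than real samples.

```python
import numpy as np

def trim_chain(chain, n_burn=30_000, thin=10):
    """Discard the burn-in steps and keep every `thin`-th remaining step."""
    return chain[n_burn::thin]

# Placeholder chain (hypothetical): 90,000 steps, 2 walkers, 13 parameters.
chain = np.zeros((90_000, 2, 13), dtype=np.float32)
samples = trim_chain(chain)
flat_samples = samples.reshape(-1, samples.shape[-1])
```

With 90,000 iterations, a 30,000-step burn-in and a thinning factor of 10, each walker contributes 6,000 samples to the final posterior.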

6.2 Posterior Probability Distributions 

Figure 7 presents a summary of the inferences for both the iHOD (brown filled) and cHOD (gray open) analyses, showing the 1D posterior distribution for each of the 13 model parameters (diagonal panels), and the 95% and 68% confidence regions for all the parameter pairs (off-diagonal panels). In the panels of the lower triangle, we highlight the results from our fiducial model, i.e., the iHOD model, employing the clustering and g-g lensing from the entire galaxy population above the mixture limit and self-consistently accounting for the survey incompleteness in stellar mass. In those panels we also show the 95% confidence regions from the iHOD constraints that employ the clustering and the lensing signals separately. The constraint from using the clustering alone (blue solid contours) is significantly tighter than that from the g-g lensing (red dashed contours), due to the much higher S/N in the $w_p$ measurements. However, the g-g lensing is absolutely essential in the joint iHOD analysis, helping break the degeneracy between the scatter parameters ($\sigma_{\ln M_*}$ and $\eta$) and the slope of the SHMR on the high mass end ($\delta$). In each panel of the upper triangle, we compare the constraints from the fiducial model (filled contours) to that of the traditional approach, i.e., the cHOD model, which is limited to 54% of the iHOD galaxies and assumes the samples to be complete (open contours). While the two constraints are largely consistent with each other, the iHOD analysis is able to obtain a much tighter constraint because of the larger span in stellar mass range and the higher number of galaxies (hence the higher S/N of the measurements, as illustrated earlier). In particular, the iHOD analysis substantially improves the constraints on the pivot of the SHMR ($\lg M_h^1$, $\lg M_*^0$) and its high-mass end slope $\delta$. The five parameters that describe the satellite HOD benefit the most from the inclusion of low stellar mass samples, with much stronger constraints in iHOD than in cHOD. The 68% confidence regions of the 1D posterior constraints are listed in the table of parameter priors and constraints.

Figure 6. Comparison between the galaxy clustering and g-g lensing measurements from SDSS with the signals predicted by the best-fit iHOD model, for the eight stellar mass-selected samples. For each sample, the top and bottom panels show the projected correlation function $w_p$ and g-g lensing signal $\Delta\Sigma$, respectively. In each panel, the data points with errorbars are the measurements, and the thick line is the best-fit signal, which is decomposed into the 2-halo (thin solid), the 1-halo central (thin dashed), and the 1-halo satellite (thin dot-dashed) contributions shown underneath. The dotted and the dot-dot-dashed lines in the lower sub-panels represent the lensing contributions from the subhalo dark matter and the galaxy stellar mass, respectively. The x-/y-axis ranges are uniform across all panels. The model provides an excellent fit to both the clustering and lensing signals of galaxies over four decades in stellar mass.

The iHOD model favours a scatter of $\sigma_{\ln M_*} = 0.50^{+\ldots}_{-0.03}$
about the mean SHMR with a slightly negative slope of $\eta =
-0.04 \pm 0.02$, while the cHOD model infers a constant scatter
of $\sigma_{\ln M_*} = 0.42^{+\ldots}_{-0.08}$. However, the two constraints
converge to a similar scatter on the high mass end, where the iHOD model
shrinks its mass-dependent scatter to meet the lower constant value
preferred by the cHOD model. While the cHOD does not constrain
the concentration ratio $f_c$ well, the iHOD strongly favours the
scenario where the satellite distributions are ~15% less concentrated
than the dark matter within the same halos ($f_c = 0.86^{+\ldots}_{-\ldots}$). This
galaxy under-concentration agrees with the observational findings of
Wang et al. (2014) and Budzynski et al. (2012) (but see Watson
et al. 2012 for the degeneracy between $f_c$ and the inner slope of the
cluster density profiles).


The improvement in the overall constraints from cHOD to
iHOD is better illustrated in Figure 8, where we translate the
uncertainties of the individual model parameters to those of the mean
SHMR and the satellite HOD separately. In the left panel, the red
solid and gray dashed curves are the mean SHMR inferred from the
iHOD and cHOD analyses, respectively. The width of each shaded
band, i.e., the 68% uncertainty in the mean log-stellar mass at fixed
halo mass, directly reflects the joint posterior probability
distribution of the five parameters ($\lg M_h^1$, $\lg M_*^0$, $\beta$,
$\delta$, $\gamma$) that determine the mean SHMR. The two constraints
are consistent with each other, especially on the high mass end. This
is very reassuring: as we mentioned in Section 4.4, the two models are
identical to each other for the high stellar mass samples, because the
completeness approaches unity and is independent of halo mass. As
expected from Figure 7, the uncertainties in the mean SHMR are greatly
reduced across all halo masses in the iHOD constraint, with the most
significant reduction happening above $[\ldots]\,M_\odot$ (by a factor
of two in log-stellar mass). Furthermore, both of the mean SHMRs
are best constrained at the intermediate mass range around the
pivotal point ($\lg M_h^1$, $\lg M_*^0$), due to the highest model
sensitivity and the highest S/N at this stellar mass.

Figure 7. Confidence regions from our iHOD analysis of the galaxy clustering and the g-g lensing data in the 2D planes spanned by all pairs of
model parameters. Histograms in the diagonal panels show 1D posterior distributions of individual parameters. For comparison, constraints from the cHOD
analysis are only shown in the panels of the upper triangle. Contour levels run through confidence limits of 95% (light brown/gray) and 68% (dark brown/gray)
inwards. The tighter iHOD constraint compared to cHOD is due to the combination of improved overall statistics from larger samples and additional information
from the low mass galaxies. In the panels of the lower triangle, blue solid and red dashed contours indicate the 95% confidence regions of the two separate
iHOD constraints using the clustering and the lensing data, respectively.

Similarly, in the right panel of Figure 8, we show the
constraints on the satellite occupation number predicted for five of the
stellar mass samples that were used in the cHOD analysis. The gray
dashed lines show the median satellite occupation numbers as
functions of halo mass inferred from the cHOD analysis, with their
1-$\sigma$ uncertainties shown as the corresponding gray bands. For
comparison, we have predicted two types of satellite HODs for these five
cHOD samples from the best-fit iHOD model: one for the parent
satellite galaxies (thin dashed lines), and the other for the
observed galaxies (thick solid lines). To avoid clutter, we only show
the uncertainty bands associated with the inferred observed satellite
HODs, all of which are considerably narrower than the gray bands,
especially for the two highest stellar mass samples. This
improvement is very encouraging. As we will show later in Section 7.3, the
occupation statistics of the satellite galaxies, long regarded
as a mere nuisance in the HOD modelling of galaxy clustering and
HOD-based cosmological constraints (Yoo & Seljak 2012), are
constrained well enough in the iHOD analysis to provide important
insights into the physical formation and evolution of the satellite
galaxies.

Figure 8. Constraints on the SHMR for the central galaxies (left) and the satellite HODs for four stellar mass bins (right), as a more visually appealing and
scientifically informative presentation of the confidence regions shown in Figure 7. In the left panel, the red solid and the gray dashed lines show the inferred
SHMR from the iHOD and cHOD analysis, respectively, and the shaded regions represent the 68% confidence range at fixed halo mass. In the right panel, solid
coloured and dashed gray lines show the expected HODs for the observed galaxies from the best-fit iHOD and cHOD model, respectively, with each shaded
band indicating the 68% confidence range at fixed halo mass; long-dashed coloured lines show the parent HODs predicted from the best-fit iHOD model. The
difference between the solid and long-dashed coloured lines implies possible incompleteness in the stellar mass samples.

Table 2. Description, prior specifications, and posterior constraints of the parameters in the model. All the priors are uniform distributions running across the
entire range of possible values for the parameters, and the uncertainties are the 68% confidence regions derived from the 1D posterior probability distributions.

Parameter            Description                                             Uniform Prior Range   iHOD                          cHOD
$\lg M_h^1$          Characteristic halo mass of the SHMR                    [9.5, 14.0]           $12.10^{+\ldots}_{-\ldots}$   ...
$\lg M_*^0$          Characteristic stellar mass of the SHMR                 [9.0, 13.0]           ...                           $10.47^{+\ldots}_{-\ldots}$
$\beta$              Low-mass slope of the SHMR                              [0.0, 2.0]            ...                           $0.54^{+\ldots}_{-\ldots}$
$\delta$             Controls high-mass slope of the SHMR                    [0.0, 1.5]            $0.42^{+0.03}_{-0.04}$        ...
$\gamma$             Controls intermediate-mass behaviour of the SHMR        [-0.1, 4.9]           ...                           ...
$B_{\rm sat}$        Normalises the scaling of $M_{\rm sat}$                 [0.01, 25.0]          $8.98^{+\ldots}_{-\ldots}$    ...
$\beta_{\rm sat}$    Slope of the scaling of $M_{\rm sat}$                   [0.1, 1.8]            ...                           ...
$B_{\rm cut}$        Normalises the scaling of $M_{\rm cut}$                 [0.0, 6.0]            ...                           ...
$\beta_{\rm cut}$    Slope of the scaling of $M_{\rm cut}$                   [-0.05, 1.50]         $0.41^{+0.16}_{-\ldots}$      ...
$\alpha_{\rm sat}$   Power-law slope of the satellite HOD                    [0.5, 1.5]            $1.00^{+\ldots}_{-0.02}$      ...
$\sigma_{\ln M_*}$   Low-mass scatter in the SHMR                            [0.01, 3.0]           $0.50^{+\ldots}_{-0.03}$      $0.42^{+\ldots}_{-0.08}$
$\eta$               Slope of the scaling of high-mass scatter               [-0.4, 0.4]           $-0.04 \pm 0.02$              ...
$f_c$                Concentration ratio between satellites and dark matter  [0.1, 3.0]            $0.86^{+\ldots}_{-\ldots}$    ...

It is important to point out that, when calculating the HODs for
the observed galaxies in Figure 8, we have made use of the actual
number of observed galaxies in each sample in order to normalise
the amplitude of $\langle N_{\rm sat}(M_h) \rangle$ using Equation (12), while the parent
HODs are predicted directly from the best-fit model parameters
using Equation ([...]). As mentioned in Section 5.1, the iHOD analysis
constrains the host halo distribution of the average galaxy in each
sample by matching to the observed clustering and g-g lensing
signals, and is thus entirely agnostic of the observed amplitude of the
stellar mass functions. Therefore, it is highly nontrivial to find
in the right panel of Figure 8 that all the predicted parent HODs
lie right above their observed counterparts, consistent with
our expectation that the galaxy samples above the mixture limit
are subject to some mild level of incompleteness. We will inspect
this consistency closely in Section 7.1.

Figure 6 compares the clustering and g-g lensing signals
measured from data to those predicted from the set of best-fit parameters
derived from the median values of their corresponding 1D posterior
probability distributions. We refer to this set of “median”
parameters as the “best-fit” parameters for simplicity, even though it does
not produce the maximum posterior probability or likelihood value.
The lowest stellar mass sample is subject to severe cosmic variance
due to its small volume, resulting in a relatively poor $w_p$ fit on large
scales. The model also under-predicts the small-scale clustering
signal for the $\lg M_* = [9.4, 9.8]$ sample, which is largely overrun by
the CfA2 Great Wall (Geller & Huchra 1989). Overall, the model
provides an excellent fit to the data over almost four decades in
stellar mass, across distance scales from the galactic disks to tens
of megaparsecs.
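
As an illustration of how a “median” parameter value and its 68% confidence region can be read off a 1D posterior, the sketch below uses the 16th, 50th and 84th percentiles of a set of samples. The function name `summarize_posterior` is hypothetical, and the samples are synthetic Gaussian draws chosen only to resemble the quoted constraint on $\eta$; they are not the actual chain.

```python
import random
import statistics


def summarize_posterior(samples):
    """Return (median, minus_err, plus_err) from the 16/50/84th percentiles."""
    # quantiles(n=100) returns the 99 percentile cut points; index i holds
    # the (i + 1)-th percentile.
    q = statistics.quantiles(samples, n=100, method="inclusive")
    lo, med, hi = q[15], q[49], q[83]
    return med, med - lo, hi - med


# Synthetic stand-in for a thinned 1D chain (assumed Gaussian, for
# illustration only).
rng = random.Random(0)
eta_samples = [rng.gauss(-0.04, 0.02) for _ in range(20_000)]

median, minus_err, plus_err = summarize_posterior(eta_samples)
print(f"eta = {median:.3f} (+{plus_err:.3f} / -{minus_err:.3f})")
```

Evaluating the model at the vector of such medians gives the “best-fit” prediction in the sense used above, which in general differs from the maximum-likelihood point.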


6.3 Systematic Uncertainties in the iHOD Model

Observationally, the systematic errors in our fiducial iHOD
analysis come from the uncertainties in the stellar mass estimates and
in the measurements of the projected correlation functions and the
g-g lensing signals. As mentioned in Section 2.2, the estimation
of $M_*$ is subject to various photometric and model uncertainties
that mostly affect the overall normalization of the derived stellar
masses, with little perturbation in the ranking order of individual
galaxies within the catalogue. Therefore, the inferred mapping
between the stellar mass and the dark matter halos from our analysis
can be straightforwardly re-calibrated for other stellar mass
estimators.

The uncertainties in the $w_p$ and the $\Delta\Sigma$ measurements are
estimated internally from the data via jackknife resampling.
Compared to external estimates derived from multiple independent
catalogues, the jackknife estimate somewhat overestimates the
correla[...] ([...] 2009), while on large scales it likely underestimates
the errors because it does not include the cosmic variance above the
scales of individual subsamples. Therefore, the jackknife errors for the two
lowest stellar mass/redshift samples are underestimated on large
scales, because of the small physical size of the jackknife patches.
The error budget of these two stellar mass samples, however, is still
dominated by the uncertainties on small scales, so the fiducial
constraint is insensitive to the underestimation in the error covariance
on large scales, despite the inadequate fit to the $w_p$ measurements
of the two lowest stellar mass samples on relevant scales.

The systematic errors in the $\Delta\Sigma$ measurements have
additional contributions from the calibration biases mainly related to
the shear estimation and the photo-$z$ errors, each at a few per
cent level (see Appendix A for the scale-dependent lensing
systematics). Mandelbaum et al. (2013) addressed these systematics
by conducting a suite of ratio tests (Mandelbaum et al. 2005), i.e.,
comparisons of the signal computed using the same lens samples,
but with different sub-samples of the source catalogue. After
applying well-understood corrections, the systematic errors are
sub-dominant compared to the statistical errors and compared to the
systematic uncertainties due to modelling assumptions, which we
will discuss next.

The theoretical systematic errors in our fiducial iHOD
analysis have four main sources: 1) simplified model assumptions in the
iHOD formalism, 2) uncertainties in the theoretical prediction of
halo statistics like the halo mass function and the halo bias
function, 3) our ansatz of the weak $M_h$-dependence of the detection
rate at fixed $M_*$, and 4) uncertainties in the combined treatment
of the mis-centering and subhalo lensing. We have discussed 3) in
detail in Section 4 and will comment more on its potential
impact in Section 7.1, and the impact from 4) has been discussed and
addressed in Section 5.3. Therefore, here we focus on the first two
issues and discuss their potential impacts on our model constraints
in turn.

As a variant of the HOD formalism, the iHOD model inherits
the systematic uncertainties associated with the generic HOD
models, which assume that the average galaxy content of halos depends
solely on the halo mass, predicted by the basic excursion set
theory of structure formation (Press & Schechter 1974; Bardeen et al.
1986; Bond et al. 1991). However, since the halo assembly histories
in cosmological simulations are affected by the large-scale
environment (Wang et al. 2007; Dalal et al. 2008), some halo properties,
including age, concentration, spin, richness, and most importantly
clustering, exhibit systematic differences between low and high
[...] (Gao et al. 2005; Wechsler et al. 2006; Harker et al. 2006;
Zhu et al. 2006; Hahn et al. 2007; Jing et al. 2007; Li et al. 2008;
Faltenbacher & White 2010; Croft et al. 2012). This effect, broadly
termed “halo assembly bias”, [...]
galaxy formation histories as well, resulting in a “galaxy assembly
bias”. However, hydrodynamic simulations and SAMs predict only
a small impact (< 10%) of the halo assembly bias on galaxy
clustering statistics (Yoo et al. 2006; Croton et al. 2007; Zu et al. 2008),
while observationally a smoking-gun detection of the galaxy
assembly bias remains elusive (Berlind et al. 2006; Blanton & Berlind
2007; Wang et al. 2013; Hearin et al. 2014). Additionally, the halo
assembly bias becomes significant only for halos below the
characteristic non-linear mass scale, where our error budget is dominated
by statistical errors and cosmic variance. However, it is worth
noting that halo assembly bias may have a much greater impact on
the HOD modelling of colour-selected galaxy samples, and could
dominate the error budget on all mass scales for both colours
(Zentner et al. 2014).
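
The jackknife resampling used earlier in this section for the $w_p$ and $\Delta\Sigma$ uncertainties can be sketched with the standard delete-one estimator. This is a generic illustration rather than the authors' code: the function name `jackknife_covariance` and the toy input are hypothetical, and the number of sky patches used in the paper is not stated in this section.

```python
def jackknife_covariance(estimates):
    """Delete-one jackknife mean and covariance.

    estimates[k][i] is the statistic in radial bin i, re-measured with
    sub-region k removed from the survey footprint.
    """
    n = len(estimates)
    nbin = len(estimates[0])
    mean = [sum(e[i] for e in estimates) / n for i in range(nbin)]
    factor = (n - 1) / n  # jackknife normalisation
    cov = [
        [
            factor * sum((e[i] - mean[i]) * (e[j] - mean[j]) for e in estimates)
            for j in range(nbin)
        ]
        for i in range(nbin)
    ]
    return mean, cov


# Toy input: three jackknife realisations of a two-bin measurement.
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
jk_mean, jk_cov = jackknife_covariance(toy)
print(jk_mean, jk_cov)
```

Because each realisation omits only one patch, the estimator cannot capture fluctuations on scales larger than a patch, which is the large-scale underestimation noted in the text.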

The cause for the theoretical uncertainties in predictions of the
halo mass and bias functions is three-fold: i) the Universe may have
a non-$\Lambda$CDM cosmology (for halo statistics beyond $\Lambda$CDM, see
Bhattacharya et al. 2011; Cui et al. 2012a; Ichiki & Takada 2012;
Murray et al. 2013; Zhang et al. 2013; LoVerde 2014) or a different
set of cosmological parameters (especially $\sigma_8$; see Planck
Collaboration et al. 2015 for details on the tension between different
probes) than the particular $\Lambda$CDM cosmology we adopt; ii) the
predictions may be inaccurate due to the unaccounted-for baryonic
effects (at the 5 to 20 per cent level depending on the mass scale and
the feedback models, see Cui et al. 2012b; Cusworth et al. 2014;
Velliscig et al. 2014); and iii) the predictions may be poorly
calibrated on the high mass end where the halos are rare (at the 5 per
cent level for $M_h \sim 10^{[\ldots]}\,h^{-1}M_\odot$ at $z=0$, see Crocce et al. 2010;
Watson et al. 2013; Bocquet et al. 2015). Among the three types
of errors, the calibration errors affect few galaxies, as the
probability of any observed galaxy sitting in a halo with mass above
$[\ldots]\,M_\odot$ is extremely low, and are thus negligible in our
analysis. Errors due to wrong cosmology, ignored baryonic effects, or
the combination of both, change the halo mass and bias functions
together in a coherent manner that can be roughly mimicked by
changing $\Omega_m$ and/or $\sigma_8$ within the $\Lambda$CDM model. In particular,
changing $\Omega_m$ at fixed $\sigma_8$ shifts the predictions along the halo mass
axis almost uniformly (albeit with a minor change in the functional
shapes due to the change in the power spectrum), while changing
$\sigma_8$ at fixed $\Omega_m$ modifies the amplitude of the predictions more
progressively at higher halo mass (see figure 7 in Zu et al. 2014 for
a pedagogical illustration). We explore the impact of i) and ii) by
perturbing $\Omega_m$ and $\sigma_8$ using both theoretical arguments and mock
tests below.

In our analysis on large scales, we are effectively using
the halo bias vs. halo abundance relation compressed from the
halo mass and bias functions, i.e., the bias of $M_h$-thresholded
halo samples $b(n)$ as a function of the co-moving number
density of that sample, $n(>M_h)$. Due to the exponential decline
of the abundance of massive halos, the theoretical
uncertainties in this $b(n)$-$n(>M_h)$ relation are only important at the
high mass end, where the galaxy sample is primarily
composed of massive BCGs with small satellite fraction. In the
context of the peak background-split theory of biased
clustering (Kaiser 1984; Cole & Kaiser 1989; Mo et al.;
Sheth & Tormen 1999), for very rare, very highly biased peaks,
$b(n) \propto \sigma_8^{-1}$, yielding $w_p(n) \propto b^2(n)\,\xi_{mm} \propto b^2(n)\,\sigma_8^2 \sim$ constant,
and $\Delta\Sigma \propto \Omega_m\,b(n)\,\xi_{mm} \propto \Omega_m\sigma_8$, respectively. These two
approximations hold even in the presence of baryons. Therefore, the
clustering is barely affected by the changes in $\Omega_m$ or $\sigma_8$, while the
lensing is linearly responsive to the change in the product of
$\Omega_m$ and $\sigma_8$. Since the galaxy clustering measurements have a better
overall S/N than the g-g lensing measurements, a 10% change
in $\Omega_m\sigma_8$ should produce a shift in the SHMR smaller than 0.06
dex in $M_h$ (if all of our constraints came from large scales). To
confirm this, we re-run the analysis using two different
combinations of $(\Omega_m, \sigma_8) = (0.26, 0.80)$ and $(0.28, 0.80)$. For the first case
where we change $\sigma_8$ alone, the shift in the derived SHMR is
indiscernible, whereas for the latter case where we increase the product
of $\Omega_m$ and $\sigma_8$ by 12%, the derived SHMR shifts to the lower halo
mass by ~0.08 dex, corresponding to an increase in the stellar mass
at $[\ldots]\,h^{-1}M_\odot$ by only ~0.03 dex, with little change in the
parameters that control the satellite HOD.
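
The size of the perturbations in the two re-runs can be checked with a short calculation. The fiducial values $(\Omega_m, \sigma_8) = (0.26, 0.77)$ used below are an assumption: they are not stated in this section and are inferred from the statements that the first case changes $\sigma_8$ alone and that the second raises the product by 12%. The function name is likewise hypothetical.

```python
import math

# ASSUMED fiducial cosmology (inferred, not quoted in this section).
FIDUCIAL = (0.26, 0.77)

# The two perturbed combinations quoted in the text.
TEST_CASES = [(0.26, 0.80), (0.28, 0.80)]


def product_change(omega_m, sigma_8, ref=FIDUCIAL):
    """Fractional change in Omega_m * sigma_8 relative to the reference."""
    return omega_m * sigma_8 / (ref[0] * ref[1]) - 1.0


changes = [product_change(om, s8) for om, s8 in TEST_CASES]
for (om, s8), frac in zip(TEST_CASES, changes):
    # In the rare-peak limit the lensing amplitude scales linearly with
    # Omega_m * sigma_8, so this is also the fractional change in amplitude.
    print(f"({om}, {s8}): {100 * frac:+.1f}% in product, "
          f"{math.log10(1 + frac):+.3f} dex")
```

Under this assumed fiducial cosmology the first case changes the product by about 4% and the second by about 12%, in line with the text.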

The remaining sources of theoretical systematic errors,
including the uncertainties in the mean halo mass-concentration relation
$c_{\rm dm}(M_h)$ (we also ignore the dispersion in the halo concentration
at fixed $M_h$), the calibration of the scale-dependence of the halo bias
$\zeta(r)$, and the deviation of the cross-correlation coefficient
$r_{cc}(r)$ from unity, are expected to be much smaller than the four
main errors discussed above. Summing all the systematic errors in
quadrature, we would find the total systematic error to be comparable
to, or even to dominate over, the statistical uncertainties in the
measurements. However, as pointed out in Coupon et al. (2015), since
each of these systematic errors affects the inference independently
with different stellar mass and scale dependencies, and we fit all of
them jointly, the combined impact of these systematic errors should be
sub-dominant compared to the statistical errors.


7 STELLAR MASS CONTENT WITHIN HALOS 

In this section, we explore the implications of our inferred HOD 
parameters for the connection between stellar and halo mass. 


7.1 Stellar Mass Functions

One of the most important consistency checks for our fiducial
iHOD model is to predict the parent SMFs at each redshift, and
compare them to the observed versions as well as the intrinsic SMFs
estimated empirically, e.g., from the $V_{\rm max}$ method. As mentioned
in Section 5.1, this check echoes the philosophy of the abundance
matching technique, but in practice is carried out in reverse. In
particular, the model uses the observed 2-point correlation function to
infer the mapping from the stellar mass content, including the
central and satellite galaxies, to the dark matter halos, and the mapping
in turn gives an exact prediction (via Equation 17) for the parent
SMF at each redshift: a “correlation” matching as opposed to
“abundance” matching.

Figure 9 presents the results of this consistency check
on SMFs at six different redshifts. For each redshift, the
upper sub-panel compares the observed (brown histogram),
$V_{\rm max}$-estimated (thin black line), and our inferred parent (thick black line
with 1-$\sigma$ uncertainty band) SMFs. The parent SMF, predicted from
the best-fit iHOD model at that redshift, is also decomposed into
the central (red dashed) and satellite (blue dot-dashed)
contributions, except for the $z = 0.04$ panel where we instead show another
measured SMF from Li & White (2009) (shifted by 0.1 dex to
remove the average bias between the two stellar mass estimators; see
figure A1 in their paper). The central galaxies dominate the
satellite galaxies in numbers across all stellar masses. The lower
sub-panel shows the ratios of the observed (brown line with Poisson
errors) and estimated SMFs (thin black line) over the parent one, i.e.,
the observed fraction $f_{\rm obs}$. The redshift evolution of the predicted
parent SMFs is solely from the cosmic evolution of the halo mass
function, and the Panter et al. (2007) SMF is the same in all
panels. Although the stellar mass estimates from Panter et al. (2007)
were independently derived from the SDSS DR3 spectroscopy, the
SMF (also used for the abundance matching in M10) provides a
good fit to the observed SMF using the MPA/JHU stellar masses,
indicating little systematic offset between the two estimators.
Overall, at $M_* > 5 \times 10^{[\ldots]}\,M_\odot$ where the completeness of SDSS is
expected to be high, the predicted parent SMFs show remarkable
consistency with the observed SMFs in both the shape and the
amplitude at all redshifts. As mentioned in Section 5.1, the iHOD
analysis is completely agnostic to the total number density of the galaxy
catalogue (i.e., the normalisation of the observed SMFs). Therefore,
the agreement seen in Figure 9 demonstrates that, in order to match
the clustering and lensing signal measured in SDSS, the placement
of galaxies within the dark matter halos is uniquely determined so
that the expected abundance of the galaxies at each stellar mass,
translated from the halo mass function predicted by the $\Lambda$CDM
Universe, automatically agrees with the observed galaxy SMFs in
SDSS.

For the two lowest redshifts, the observed SMFs at the high
mass end are known to be notably incomplete due to two
photometric confusions: 1) to avoid saturation and excessive cross-talk in the
spectrographs, some objects were rejected because they either have
saturated centres (mostly bright active nuclei) or are blended with
a saturated star; and 2) the image deblending software sometimes
over-deblended bright galaxies with large angular extent (Strauss
et al. 2002). A somewhat related issue is the over-subtraction of
the sky background mentioned in Section 2.2, which results in
suppressed flux estimates in bright galaxies. Therefore, limited to the
$z<0.05$ galaxies, the SMF measured by Li & White has a
substantially lower amplitude at $M_* > 3 \times 10^{[\ldots]}\,M_\odot$ than the Panter et
al. SMF, the observed SMFs beyond $z>0.08$, and the parent SMF
predicted by our best-fit model.

We next consider the $f_{\rm obs}$ inferred from the ratio of the
$V_{\rm max}$-estimated SMF to the predicted parent SMF in the lower sub-panels
of Fig. 9. Starting with the low mass end, $f_{\rm obs}$ is always above 60%,
and reaches 100% from $6 \times 10^{[\ldots]}\,M_\odot$ up to $2 \times 10^{[\ldots]}\,h^{-2}M_\odot$,
beyond which point the $V_{\rm max}$ method is expected to underestimate
the galaxy abundance due to the aforementioned photometric
issues. Meanwhile at the high mass end, the iHOD model predicts a
much better match to the observed SMFs than the Panter et al.
estimate does, reaching a constant $f_{\rm obs}$ of ~90%, especially at the two
highest redshifts where the photometric confusions that plague the
lower redshifts are absent and the sample of very massive galaxies
should be highly complete.

Figure 9. Comparison between the observed (brown histograms) and the parent (thick solid curves with 68% confidence range shown in gray bands) galaxy
stellar mass functions predicted from the best-fit iHOD model at six different redshifts. In each upper sub-panel, the thin solid curve is the estimated SMF of
SDSS DR3 galaxies from Panter et al. (2007). For the $z=0.04$ panel, we show another SDSS SMF estimated by Li & White (2009) from the $z<0.05$ galaxies
as the thin dashed curve (after correction for the offset between the two stellar mass estimates); for the rest of the panels the predicted parent SMF is decomposed
into the contributions from the central and satellite galaxies, indicated by the red dashed and blue dot-dashed lines, respectively. The lower sub-panels show
the ratios of the Panter et al. (2007) (thin black) and the observed galaxy number density (brown) to the predicted SMF from our model. The redshift evolution
of the predicted SMF in the model comes solely from the cosmic evolution of the halo mass function.

At the mass scale around $[\ldots]$, the $f_{\rm obs}$ inferred from
the $V_{\rm max}$-estimated SMF exhibits a dip at the 60%-70% level in the
lower sub-panels of Figure 9. There are two main possible sources
that could contribute to this dip, one theoretical and one
observational. On the one hand, the dip could be a manifestation of the
systematic uncertainty in the theoretical model. As seen from
Figure 1, this mass range marks the transition of galaxies from a
predominantly blue, star-forming population below to a red, quiescent
population above on the $M_*$-$z$ diagram. Given the strong
correlation between galaxy colour and $M_*/L$, we expect the ansatz
we have made, i.e., assuming a weak dependence of $f_{\rm obs}(M_*|z)$
on the host halo mass, to be the most fragile at this transitional
mass scale. For example, if the $M_*/L$ at fixed stellar mass is
skewed higher in more massive halos (Taylor et al. 2015), the
observed galaxies would preferentially reside in host
halos less massive than does a typical $10^{[\ldots]}\,h^{-2}M_\odot$ galaxy in the
Universe, and thus exhibit weaker clustering and g-g lensing
signals than their stellar mass suggests. As a result of this
masquerade effect, the iHOD model would place them inside smaller
halos, resulting in the over-prediction of their abundances. It is worth
noting that this theoretical systematic uncertainty is neither absent
from the traditional HOD modelling, nor would it be eliminated
by adding the SMF data as an extra constraint. On the other hand,
the photometric issues that affect the targeting of bright galaxies
would also impact this intermediate mass range, rendering the
star-forming galaxy population underrepresented in the MGS at low
redshift. The selection effect is easily discernible in Figure 1: at
$[\ldots]\,h^{-2}M_\odot$, the galaxies at redshifts lower than 0.04 have a
redder average $g-r$ colour than those at higher redshifts. Since the
majority of the detected galaxies with $M_* \sim 10^{10}\,h^{-2}M_\odot$ come
from $z<0.08$, they carry heavy weight (large values of $1/V_{\rm max}$) in
the Panter et al. and Li & White SMFs. Therefore, the
under-representation of these galaxies at low redshift translates to a
potentially significant underestimate of their abundances. In reality it is
more likely that both the theoretical and observational sources
contribute to the dip at some level, and the true parent SMF should lie
somewhere between the Panter et al. and our predicted curves.
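
The weighting argument above can be made concrete with a minimal sketch of a $1/V_{\rm max}$ estimator. It is a generic illustration, not the procedure of Panter et al. or Li & White: the function name `vmax_smf`, the bin edges, and the toy masses and volumes (in arbitrary volume units) are all hypothetical.

```python
def vmax_smf(log_masses, vmax, edges):
    """Number density per dex in each mass bin from 1/V_max weights."""
    nbins = len(edges) - 1
    weight_sum = [0.0] * nbins
    for lg_m, v in zip(log_masses, vmax):
        for b in range(nbins):
            if edges[b] <= lg_m < edges[b + 1]:
                # A galaxy visible only in a small volume counts for more.
                weight_sum[b] += 1.0 / v
                break
    return [w / (edges[b + 1] - edges[b]) for b, w in enumerate(weight_sum)]


# Toy catalogue: two low-mass galaxies detectable only nearby (small V_max)
# and one massive galaxy detectable over a ten times larger volume.
toy_lg_mass = [10.1, 10.2, 10.7]
toy_vmax = [1.0e5, 1.0e5, 1.0e6]
toy_edges = [10.0, 10.5, 11.0]

phi = vmax_smf(toy_lg_mass, toy_vmax, toy_edges)
print(phi)
```

Since each low-redshift galaxy enters with a large weight, any under-representation of such galaxies in the targeting propagates directly into the estimated abundance.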

7.2 Centrals: Stellar Mass to Halo Mass Relation 

The most crucial piece for mapping the stellar content back to dark
matter halos is the SHMR of the central galaxies. The top panel
of Figure 10 compares the SHMR and its logarithmic scatter
derived from our fiducial iHOD constraints (black solid line with gray
band) to those inferred from several other methods, including abundance
matching. To highlight the shrinkage in scatter at high halo mass
predicted by our fiducial constraint, we also show two curves that
delineate the boundaries of the constant scatter band with $\eta = 0$.
To avoid clutter, here we only show the scatter for the SHMR from
our constraint.

Let us first compare to the constraint from Moster et al.
(2010) (red dashed), which employed the same cosmology as our
analysis and matched the halo abundance to the SDSS SMF
measured by Panter et al. (2007), albeit with a different functional form
for the SHMR and a fixed scatter. The two mean SHMRs show
remarkable agreement over $\lg M_h \sim [12, 14]$, or $M_* \sim 2 \times 10^{10}$-$2
\times 10^{[\ldots]}\,h^{-2}M_\odot$, where the data are the most constraining and the
completeness is the highest. Above this mass range, the Moster et
al. relation predicts higher stellar mass for central galaxies than
our mean SHMR for a given halo mass, i.e., a steeper SHMR.
Considering the lower amplitude of the $V_{\rm max}$-estimated SMF
employed by the Moster et al. (2010) analysis for abundance matching,
naively one might expect that the observed massive galaxies have to
be pushed to reside in progressively rarer and more massive halos,
making their SHMR shallower instead. However, due to the
exponential decline of the halo mass function in this regime, the SHMR
is much more sensitive to the change in scatter than to the small
change in the detection rate. This discrepancy on the high mass end
is mainly caused by the smaller scatter assumed in the Moster et al.
(2010) analysis ($\sigma_{\ln M_*} = 0.345$, or 0.15 dex) compared to that
inferred by our model ($\sigma_{\ln M_*} \sim 0.40$ at $M_h = 10^{[\ldots]}\,h^{-1}M_\odot$). Their
smaller scatter allows far fewer low-mass halos to be considered
as hosts for the massive galaxies, thus requiring a steeper slope in
order to match the galaxy abundance.
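
Because the scatters above are quoted in both natural-log and decimal-log (dex) units, a short conversion helps when comparing them. The helper names below are hypothetical; the only input is the identity $\sigma_{\rm dex} = \sigma_{\ln} / \ln 10$.

```python
import math

LN10 = math.log(10.0)


def ln_to_dex(sigma_ln):
    """Convert a scatter in natural log to a scatter in dex."""
    return sigma_ln / LN10


def dex_to_ln(sigma_dex):
    """Convert a scatter in dex to a scatter in natural log."""
    return sigma_dex * LN10


moster_dex = ln_to_dex(0.345)  # fixed scatter quoted for Moster et al. (2010)
ihod_dex = ln_to_dex(0.40)     # high-mass scatter quoted for the iHOD model
print(f"{moster_dex:.3f} dex, {ihod_dex:.3f} dex")
```

The first conversion reproduces the 0.15 dex quoted in the text, while the iHOD value of about 0.40 in natural log corresponds to roughly 0.17 dex.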

The Behroozi et al. (2010) analysis derives the constraints on
the SHMR by abundance matching to the SDSS SMF measured
by Li & White (2009) at $z < 0.05$. They carefully accounted for
the effect of scatter in the SHMR by varying both the scatter in
the true stellar mass at fixed halo mass (intrinsic scatter) and that in the
observed stellar mass estimate at fixed true stellar mass (measurement
scatter). For the intrinsic scatter, a log-normal prior centred at 0.16
dex with a width of 0.04 dex is placed (i.e., 0.37 ± 0.09 in natural
log), and for the measurement scatter a fixed value of 0.07 dex is
applied for the SDSS data. After abundance matching, the posterior
constraint on the intrinsic scatter is 0.151 q q 2 dsx, consistent with 
their input prior. However, as clearly shown by the top left panel of 
Figure]^ the |Li & White| ( |200^ SMF is underestimated at the high 
mass end, thus shifting the mean SHMR horizontally toward higher 






Figure 10. Top panel: Comparison between the SHMRs inferred from
abundance-matching based methods and our best-fit iHOD model. The
gray band shows the logarithmic scatter predicted by the best-fit model,
while the two enveloping thin curves represent the range of constant
scatter with $\eta = 0$. Among the four external constraints used for comparison,
the Moster et al. (2010) and the Behroozi et al. (2010) results are based
on abundance matching, while the other two are derived from fitting the
clustering, lensing, and SMF jointly. Bottom panel: The total stellar mass to
halo mass ratio as a function of halo mass. Coloured layers show the
contribution from different stellar mass bins, while separate portions of the same
coloured layer indicate the relative contribution from centrals and satellites
within each stellar mass bin. The horizontal band represents the cosmic
baryon to dark matter ratio in the Universe. Note that the y-axis is
logarithmic above 0.05.

halo mass. Behroozi et al. (2010) also adopt a slightly higher $\sigma_8$
than M10 and our study, which predicts more massive halos and
further pushes the slope of the mean SHMR shallower than our
constraint.

Unlike abundance matching, our fiducial constraint is able to
break the degeneracy between the scatter and the slope of the mean
SHMR on the high mass end without assuming any external priors.
The leverage comes from the unique constraining power of combining
clustering and lensing. On large scales, the galaxy clustering
signal is proportional to $b_g^2\,\xi_{mm}$, while the g-g lensing scales
with $\Omega_m b_g\,\xi_{mm}$. On small scales, $w_p$ is less a probe of the SHMR
than $\Delta\Sigma$, which directly measures the average halo mass of
individual galaxy samples. For a fixed cosmology, a simultaneous fit to
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Wp and AE effectively constrains the clustering bias as a function 
of halo mass, for which $\Lambda$CDM predicts a relatively steep slope on
the high mass end (see the seventh and eighth columns of Table[TJ. 
However, as shown by the open contours in the $\delta$-$\sigma_{\ln M_*}$ panel in
Figure 7, the cHOD constraint still shows substantial degeneracy
between the slope parameter $\delta$ and the scatter, due to the inadequate
S/N in the measurements of $w_p$ and $\Delta\Sigma$. Thanks to the 84% more
galaxies used in the iHOD analysis, our fiducial analysis derives
a much tighter constraint with minimal residual degeneracy (filled
contours).

The L12 result shown in Figure 10 was derived from a slightly
higher redshift range ($0.22 < z < 0.48$) than the MGS used by us.
Since we have adopted a similar model framework and parameterization
to that proposed in L12, the analysis of Leauthaud et al. (2012b)
is very similar to the cHOD analysis in this paper. However, their
analysis adds the measured SMF as part of the constraining data set
besides angular clustering and g-g lensing. In this regard, although
allowing the scatter to vary freely, the constraint in Leauthaud et al.
(2012b) is somewhat more related to the AM-based methods, because
the smaller uncertainties in the SMF easily dominate the likelihood
of the fit. A straightforward comparison with the mean SHMR
of Leauthaud et al. (2012b) is complicated by the differences in
the redshift range of the galaxy samples and the stellar mass
estimators (although the COSMOS SMF below $z = 0.48$ agrees very
well with that from Panter et al. 2007; see figure 5 of L12), but
the two constraints appear to be qualitatively consistent with each
other above the characteristic halo mass. The Coupon et al. (2015)
constraint is very similar to L12, derived from a joint lensing, clus¬ 
tering, and abundance analysis, but for galaxies at a much higher 
redshift (2~0.8) and more massive than Thus, the 

deviation between the Coupon et al. (2015) SHMR and ours at the
low mass end can be attributed to the evolution of the galaxy
population from $z = 0.8$ to $z = 0.1$. At the high end, however, we do not
see the steepening of the Coupon et al. (2015) SHMR at $z \sim 0.8$, as
hinted by the L12 constraints from $z \sim 0.88$ (see figure 11 in L12).
The L12 and the Coupon et al. (2015) analyses also constrained the
scatter in the SHMR, yielding $\sigma_{\ln M_*} = 0.474$ (i.e., 0.206 dex) and
cinM,—0.506 (i.e., 0.22 dex; at respectively, 

in excellent agreement with our constraint of O.SOIq'qj. This level 
of agreement is not necessarily expected, given that the scatter in¬ 
cludes both intrinsic scatter and measurement uncertainty, the latter 
of which could differ for the three different samples studied in these 
papers. 

Below the characteristic halo mass, our mean SHMR has a 
slightly higher amplitude than all the other three SHMRs, implying 
higher stellar-to-halo mass ratios for the low stellar mass galaxies. 
This difference echoes the discrepancy seen in the comparison be¬ 
tween the parent and measured SMFs at the intermediate and low 
stellar mass scales. However, all those curves are still consistent 
with one another within the statistical uncertainties in the inferred 
mean SHMRs below Mq (compare the difference 

to, e.g., the uncertainty band in the left panel of Figure]^. 

The efficiency of stellar mass assembly at the centre of halos, 
characterised by the central stellar-to-halo mass ratio 
rises sharply as a function of halo mass at the low mass scale, 
probably due to the increasing difficulty for the stellar feedback 
to drive cold gas out of the steepening gravitational potential wells 
of halos (Hopkins et al. 2012). The growth of the central galaxies
shifts down into a much lower gear beyond the characteristic
mass Mq, decreasing from its peak ~0.04 

at to below 0.001 at h~^M q. AGN 

feedback is believed to be one of the major culprits that quench 


the star formation in those central galaxies within high mass halos. 
However, as shown by Figure]^ the satellite “cloud” also begins 
to emerge within halos above Mq, and a good indica¬ 

tor for the efficiency of stellar mass assembly has to account for 
the stellar mass of satellites. To this end, the bottom panel of Fig¬ 
ure [represents a more complete picture by showing the total stel¬ 
lar mass assembly efficiency, Ml°^/Mh (assuming /i=0.72), inte¬ 
grated over all the predicted parent galaxies above We 

also divide the total efficiency into contributions from six different 
stellar mass bins marked in the legend. Within the layer of each stel¬ 
lar mass bin, we further split the contribution into two components, 
the central (shaded colour) and the satellite (tinted colour), sepa¬ 
rated by a dashed line. For comparison, the horizontal line on top 
indicates the universal baryon-to-matter mass ratio. As expected, 
the total efficiency is dominated by the central galaxies until its 
peak at around h~^ Mq, then it levels off and asymptote 

to a constant value ~0.015 for galaxy groups and clusters, where 
the satellite contribution dominates. This asymptotic value of the 
total stellar mass fraction is consistent with observational studies 
by stacking galaxy catalogues or images inside samples of optical
and/or X-ray clusters (Budzynski et al. 2014; Bahcall & Kulier
2014). Galaxies with $M_*$ around dominate the total

stellar mass content within halos of mass above 10^^h~^M q. 

The near-constant total stellar-to-halo mass ratio in group and 
cluster-size halos is very intriguing. By comparing the integrated 
stellar mass to the stacked weak lensing signal at different radii 
within the MaxBCG clusters, Bahcall & Kulier (2014) found that 
the total mass of clusters can be largely accounted for by the to¬ 
tal dark matter mass associated with all the subhalos that host the 
satellite galaxies. Therefore, when the fractional contribution from 
the central galaxy is negligible. 
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where (dWaat/dlgM*)|Mh is the satellite stellar mass function 
conditioned at halo mass Mh, and is the subhalo mass. If the 
satellite stellar-to-subhalo mass relation $M_*/M^{\rm sub}$ were to roughly
follow the SHMR of the centrals, as assumed by our subhalo lensing
model and the SHAM methods, the aggregate of the satellite
galaxy population inside each halo is likely to be a strong function
of the halo mass when $dN_{\rm sat}/d\lg M_*$ differs from halo to halo. In
order for any two halos with different masses to share the same total
stellar-to-halo mass ratio, the stellar mass functions within the two
halos have to be somewhat self-similar: all $dN_{\rm sat}/d\lg M_*$ have
the same shape near the knee of the SMF where most of the stellar
mass is stored, with their amplitudes $A_{\rm sat}$ linearly proportional to
the halo mass. To test this hypothesis, we explore the conditional
SMFs of satellites below in Section 7.3.

The SHMR, specifying the mean logarithmic stellar mass of
central galaxies at fixed halo mass, is a convenient choice from
a theoretical modelling point of view, but observationally what is
often measured is the opposite, i.e., the mean log-halo mass or halo
mass at given stellar mass. For instance, as the predecessor of the
study in this paper, Mandelbaum et al. (2006) measured the average
log-halo masses for a set of early- and late-type galaxy stellar mass
samples in the SDSS, via the traditional HOD modelling of their
g-g lensing signals. To compare to the result from Mandelbaum et al.
(2006) and other similar studies in the literature, we compute the
distribution of the halo mass at fixed stellar mass as




$$p(M_h \mid M_*^{\rm cen}) = \frac{p(M_*^{\rm cen} \mid M_h)\,p(M_h)}{p(M_*^{\rm cen})}, \qquad (47)$$
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Figure 11. Comparison between the halo-to-stellar mass relations infeiTed from the best-fit iHOD model and that from various other weak lensing measure¬ 
ments. Top panel: the dash black curve is just the inverse function of the best-fit SHMR, i.e., the same black curve in the top panel of Figure |10| but with x- and 
y-axis swapped. The thick black curve with a gray band indicates the mean and scatter (not the statistical uncertainty on the mean) of the distribution of logarithmic
halo mass at fixed stellar mass, while the thin black curve shows the mean halo mass at fixed stellar mass. The magenta dashed and cyan dot-dashed curves
are the results from L12 using COSMOS and Hudson et al. (2015) using CFHTLenS, while the blue circles are from the maximum-likelihood weak lensing
analysis of GAMA galaxies. The green and the red squares are the average weak lensing masses measured for CFHTLenS and SDSS galaxies, respectively,
and within each survey the relations are constrained for red/early-type and blue/late-type galaxies separately using nearly identical HOD prescriptions. All the
uncertainties shown for the weak lensing mass measurements are 1-$\sigma$. Bottom panel: Similar to the top panel, but showing the ratio of each quantity over the
prediction by the best-fit iHOD model.


where $p(M_*^{\rm cen} \mid M_h)$ is the SHMR of central galaxies (similar to
Equation 21), $p(M_h)$ is the halo mass function normalised by
the total number density of halos, and $p(M_*^{\rm cen})$ is proportional to
the SMF of central galaxies, calculated from

$$p(M_*^{\rm cen}) = \int_0^{+\infty} p(M_*^{\rm cen} \mid M_h)\,p(M_h)\,dM_h. \qquad (48)$$

The halo-to-stellar mass relation, $p(M_h \mid M_*^{\rm cen})$, computed from the
best-fit SHMR and its scatter shown in the top panel of Figure 10
via Equations (47) and (48), is represented in Figure 11 in two
forms, the mean log-halo mass at fixed stellar mass (thick black
solid curve),

$$\langle \ln M_h \mid M_* \rangle = \int p(M_h \mid M_*^{\rm cen})\,\ln M_h\,dM_h, \qquad (49)$$

and the mean halo mass at fixed stellar mass (thin black solid
curve),

$$\langle M_h \mid M_* \rangle = \int p(M_h \mid M_*^{\rm cen})\,M_h\,dM_h, \qquad (50)$$

respectively. The difference between the two means is small at the


low stellar mass end but grows to 0.1 dex at high masses, and the
gray band indicates the 1-$\sigma$ logarithmic scatter about the mean
relation. As seen in Figure 10, the original SHMR flattens on the high
mass end, so the log-scatter in halo mass at fixed stellar mass is
much larger than on the low stellar mass end. The 1-$\sigma$ error on the
mean relation (not shown here, but see Figure]^ is comparable to 
the scatter at the lower stellar mass end, but much smaller than the 
scatter above M*~lO^°/i“^M 0 . Also shown on Figure [TT] is the 
black dashed curve from Equation using the best-fit param¬ 
eters, i.e., the same thick black curve shown in Figure [T0| except 
that the x- and y-axises are swapped. The difference between the 
black dashed and the two black solid curves can easily exceed 0.5 
dex for high stellar mass samples, and reaches almost 1 dex for 
galaxies with M*>5 x Mq. Therefore, caution should be 

exercised when trying to compare the theoretical SHMR with the 
direct observational estimates of halo masses for any galaxy sam¬ 
ples selected above = The magenta dashed and 

the cyan dot-dashed curves in Figure[^are the halo-to-stellar mass 
relations inferred from the L12 (COSMOS) and the |Hudson et al.| 
(|2015^ (CFHTLenS) analyses, respectively, showing great consis- 
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Figure 12. Left panel. Satellite stellar mass function dNsat / dM^ conditioned at six different halo masses marked on the top right. The shaded regions indicate 
the 68% uncertainties derived from the iHOD constraint. The inset panel shows the same set of curves, with each amplitude re-scaled by the corresponding
halo mass, highlighting the departure of the conditional SMF from the Schechter functional form in high mass halos. Right panel. Similar to the left panel,
but for $M_*\,dN_{\rm sat}/d\lg M_*$, the fractional contribution to the total stellar-to-halo mass ratio per dex in $M_*$. The satellite galaxies with stellar mass around a few
times 10^®contribute the most to the total stellar mass content in each halo. 


tency between independent constraints from different surveys. The 
circles and the squares indicate the average halo masses inferred
from individual galaxy stellar mass samples using the weak lensing
measurements in GAMA (Han et al. 2015) and CFHTLenS
(Velander et al. 2014), respectively. The lenses used in the Han et al.
(2015) analysis were central galaxies of groups and clusters that
are almost volume-limited, while for the latter, the Velander
et al. (2014) constraint was separately derived for colour-segregated
samples that are flux-limited. Despite the differences in the methods
and data sets, our relation is largely consistent with both sets
of measurements. The triangles are the measurements from the earlier
SDSS DR4 galaxies (Mandelbaum et al. 2006), also split into
sub-samples of early- and late-type galaxies. Our analysis is a direct
update from the Mandelbaum et al. (2006) study, using SDSS
DR7 (vs. DR4) galaxies, an updated MPA/JHU stellar mass catalogue,
improved photometric redshifts for the source sample, additional
measurements (galaxy clustering), and a more sophisticated HOD
model, and the two constraints are fully consistent with each other.
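
The inversion from the SHMR to the halo-to-stellar mass relation (Equations 47 to 50) can be illustrated numerically. The Python sketch below is a toy under loudly stated assumptions: the halo mass function `toy_hmf`, the mean relation `toy_mean_ln_mstar`, and the constant scatter of 0.4 in natural log are hypothetical stand-ins, not the fitted iHOD model. It only shows how the mean log-halo mass, the mean halo mass, and the naively inverted SHMR can differ at fixed stellar mass.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def toy_hmf(ln_mh):
    """Hypothetical dn/dln(M_h) with arbitrary normalisation (assumption)."""
    x = np.exp(ln_mh) / 1e14
    return x ** -0.9 * np.exp(-x)

def toy_mean_ln_mstar(ln_mh):
    """Hypothetical mean SHMR: steep below 1e12, shallow above (assumption)."""
    x = np.exp(ln_mh) / 1e12
    return np.log(4e10 / (x ** -2.0 + x ** -0.3))

# Integration grid in ln(M_h); p(M_h) dM_h is written as (dn/dlnM_h) dlnM_h.
LN_MH = np.linspace(np.log(1e10), np.log(1e16), 4000)

def halo_mass_posterior(ln_mstar, sigma_ln=0.4):
    """Bayes inversion as in Eq. (47): p(M_h|M_*) on the ln(M_h) grid."""
    resid = (ln_mstar - toy_mean_ln_mstar(LN_MH)) / sigma_ln
    post = np.exp(-0.5 * resid ** 2) * toy_hmf(LN_MH)
    return post / trapezoid(post, LN_MH)  # the normalisation plays the role of Eq. (48)

def halo_mass_summaries(mstar, sigma_ln=0.4):
    """Return (<lg M_h>, lg <M_h>, lg of the inverted SHMR) at fixed M_*."""
    post = halo_mass_posterior(np.log(mstar), sigma_ln)
    mean_ln = trapezoid(post * LN_MH, LN_MH)          # Eq. (49)
    mean_m = trapezoid(post * np.exp(LN_MH), LN_MH)   # Eq. (50)
    inverse = np.interp(np.log(mstar), toy_mean_ln_mstar(LN_MH), LN_MH)
    return mean_ln / np.log(10.0), np.log10(mean_m), inverse / np.log(10.0)

for mstar in (1e10, 3e11):
    print(mstar, halo_mass_summaries(mstar))
```

In this toy, the inverted SHMR sits well above both means at high stellar mass, because the steeply falling halo mass function combined with scatter favours lower-mass halos at fixed stellar mass.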

Finally, to facilitate the comparison between our constraint
and observational studies that measure the mean (log-)halo mass of
individual galaxy samples, we provide an analytic fitting formula to
the best-fit halo-to-stellar mass relation shown in Figure 11
(computed from the fiducial iHOD MCMC chain via Equation 49),

$$\langle \lg M_h \mid M_*^{\rm cen} \rangle = 4.41\,\left[1 + \exp\left(-1.82\,(\lg M_*^{\rm cen} - 11.18)\right)\right]^{-1} + 11.12\,\sin\left(-0.12\,(\lg M_*^{\rm cen} - 23.37)\right), \qquad (51)$$

where $M_h$ and $M_*^{\rm cen}$ are in units of $h^{-1}M_\odot$ and $h^{-2}M_\odot$, respectively.
The above fitting formula is accurate to within 0.15% across
the entire stellar mass range above 3 x Mq. 
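
For convenience, the fitting formula of Equation (51) can be evaluated with a few lines of Python. This is a sketch rather than released code: the function name is illustrative, and the exponent of the bracketed term is read as $-1$ (a logistic form), which is an assumption about the typeset equation. The formula should only be used within the stellar mass range over which it was fitted.

```python
import math

def mean_lg_halo_mass(lg_mstar_cen):
    """Evaluate Eq. (51): mean lg(M_h) at fixed central stellar mass.

    lg_mstar_cen is lg(M_*) with M_* in h^-2 Msun; the result is lg(M_h)
    with M_h in h^-1 Msun. The -1 exponent on the bracket is an assumption.
    """
    logistic = 4.41 / (1.0 + math.exp(-1.82 * (lg_mstar_cen - 11.18)))
    sinusoid = 11.12 * math.sin(-0.12 * (lg_mstar_cen - 23.37))
    return logistic + sinusoid

for lg_ms in (10.0, 10.5, 11.0, 11.5):
    print(f"lg M* = {lg_ms:4.1f}  ->  <lg Mh> = {mean_lg_halo_mass(lg_ms):.2f}")
```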


7.3 Satellites: Conditional Stellar Mass Function 

Instead of imposing some fixed functional form (e.g., the Schechter
function) for the conditional stellar mass function (CSMF), we
compute the satellite CSMFs as the derivatives of the satellite
HODs at fixed halo mass. This model flexibility allows for a more
thorough exploration of the shapes of the satellite CSMFs,
especially any potential mass-dependent deviations from the Schechter
function at both the low and high stellar mass ends. The left panel of
Figure 12 shows the satellite CSMFs inferred for halos of six different
masses marked on the legend, with each shaded band indicating
the 68% uncertainty from the iHOD constraint. For low mass halos,
the form of the CSMFs resembles the Schechter function with
a flattening low $M_*$ portion and a sharp exponential cutoff at high
M*, but it starts to exhibit an extra power-law portion at intermedi¬ 
ate M* for halos more massive than Mq. The inset panel 

shows the six CSMFs scaled by their corresponding halo mass. All 
the scaled CSMFs would land on top of one another if the satel¬ 
lite population inside halos follows an exact homology sequence. 
Since asat is strongly constrained to be around unity (asat = 
1.0001°:“) , the number of satellite galaxies above any stellar mass 
threshold scales linearly with halo mass above Meat (see Equa¬ 
tion l|22^), leading to a self-similar behaviour across the low M* 
range h~^M q). At high stellar mass range, however, the 

homology is broken and an excess population of galaxies of
successively higher stellar mass begins to emerge as satellites. They
used to be the central galaxies of small halos that were later
accreted by their current-day, more massive host halos. This is a physical
picture from the hierarchical structure formation paradigm of
the $\Lambda$CDM Universe, recovered by the prediction of our fiducial
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model, a purely statistical framework that describes the clustering 
and lensing of galaxies. 
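
The idea of obtaining the CSMF as a derivative of the threshold satellite HOD can be sketched numerically. In the Python example below, `toy_nsat_above` is a hypothetical satellite occupation (a power law in halo mass with an exponential cutoff, with made-up mass scales), not the fitted iHOD satellite HOD; only the differentiation step reflects the procedure described above.

```python
import numpy as np

def toy_nsat_above(lg_mstar, m_h, alpha=1.0):
    """Hypothetical <N_sat(>M_*|M_h)>; all mass scales here are assumptions."""
    m_sat = 1e13 * 10.0 ** (1.2 * (lg_mstar - 10.0))  # toy satellite mass scale
    m_cut = 0.1 * m_sat                               # toy cutoff mass scale
    return (m_h / m_sat) ** alpha * np.exp(-m_cut / m_h)

def satellite_csmf(lg_mstar, m_h, alpha=1.0):
    """CSMF dN_sat/dlg(M_*) as minus the derivative of the threshold HOD."""
    return -np.gradient(toy_nsat_above(lg_mstar, m_h, alpha), lg_mstar)

lg_grid = np.linspace(9.0, 12.0, 400)
for m_h in (1e13, 1e14, 1e15):
    csmf = satellite_csmf(lg_grid, m_h)
    # Dividing by halo mass mimics the re-scaling used to test self-similarity.
    print(f"M_h = {m_h:.0e}: scaled CSMF at lg M* = 9 is {csmf[0] / m_h:.3e}")
```

With `alpha = 1`, the scaled curves coincide wherever the cutoff term is negligible, which is the self-similar behaviour discussed in the text; the high-mass behaviour of this toy is not meant to reproduce the measured CSMFs.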

To understand the asymptotic behaviour of the total stellar-to-halo
mass ratio mentioned in Section 7.2, we show the stellar
mass-weighted CSMFs, $M_*\,dN_{\rm sat}/d\lg M_*$, in the right panel of
Figure 12. The peaks in those curves confirm the finding from the
bottom panel of Figure |l0| that galaxies of M*~2 x Mq 

contribute the most amount of stellar mass per dex in M* to each 
halo. Again, in the inset panel we scale each quantity by the cor¬ 
responding halo mass, showing the fractional contribution to the 
total stellar-to-halo mass ratio per dex in M* for each halo mass. 
The asymptotic behaviour of the total stellar mass fraction can 
be mainly attributed to two factors: 1) the self-similarity of satel¬ 
lites galaxies below Mq ensures that the total ratio in¬ 
tegrated up to is the same for all halos above 

5 X Mq\ and 2) the satellite galaxies above Mq 

contribute negligibly to the total ratio, so any increase in halo mass 
brings little change to the total stellar-to-halo mass ratio. 


8 CONCLUSIONS 

We have developed a novel extension to the HOD framework,
namely the iHOD model, to solve for the mapping between the
observed stellar mass distribution and the underlying dark matter
halos, via modelling the projected galaxy auto-correlation function
$w_p$ and the g-g lensing $\Delta\Sigma$ signals. In particular, the iHOD model
has two main features:

• It is able to predict $w_p$ and $\Delta\Sigma$ for all the galaxies above
the mixture limit (i.e., the minimum stellar mass for the quiescent
galaxies to be detected above the flux limit). This flexibility
allows us to include ~84% more galaxies than the traditional HOD
method, which substantially improves the S/N of the $w_p$ and $\Delta\Sigma$
measurements (by 10% to almost a factor of two depending on stellar
mass).

• It takes into account the volume incompleteness of galaxy
stellar mass samples in a statistically consistent way, eliminating
the need to assume completeness or parameterize the
redshift-dependent selection functions.

The crucial input to the iHOD model is the shape of the observed
SMF at each redshift, which is employed by the iHOD model
to construct an HOD for the galaxies at that redshift slice using
$p(M_*, M_h)$, the 2D joint probability density distribution of a
galaxy with stellar mass $M_*$ sitting in a halo of mass $M_h$. The
clustering and lensing signals of any galaxy sample can then be
calculated by combining the signals predicted from the HODs of
the individual redshift slices.

The 2D distribution $p(M_*, M_h)$ has two components, the
mean and scatter of the SHMR for the central galaxies and the
global HOD for the satellite galaxies. We adopt a similar
parameterization for the two components as L11, but allow the power-law
slope of the satellite HOD ($\alpha$) and the concentration ratio between
the satellite and dark matter profiles ($f_c$) to vary freely during
the constraint. Furthermore, we also re-calibrate the subhalo lensing
model in our analysis against the state-of-the-art hydrodynamic
simulation MassiveBlack-II. Thanks to the greatly improved S/N
in the $w_p$ and $\Delta\Sigma$ measurements, our fiducial iHOD analysis not
only breaks the degeneracy between the slope and the scatter of
the SHMR, therefore placing a stringent constraint on the link between
the dark halos and their central galaxies, but also derives the
conditional SMF of satellite galaxies as a function of halo mass.


The inferred SHMR is in good agreement with constraints from 
the abundance matching methods and the joint clustering, lensing, 
and galaxy abundance analyses using the L11 framework. For the
SMF of satellite galaxies at fixed halo mass, the best-fit model pre¬ 
dicts a departure from the Schechter function in massive groups 
and clusters, probably a result of mergers and accretion that con¬ 
vert central galaxies of small halos into satellites of more massive 
systems. It will be interesting to compare the conditional SMFs of 
satellites statistically derived from the iHOD model, which allows 
the satellites to have a different SHMR than the central galaxies, 
to that inferred from abundance matching, which directly assigns 
satellite stellar masses to subhalos in the simulations assuming the 
same SHMR as the central galaxies. 

In principle, the iHOD model uses only the shape of the observed
galaxy SMF as input and is agnostic of the SMF amplitude:
using half of the galaxies randomly drawn from the original
catalogue would yield the same best-fit model parameters, albeit
with larger uncertainties. However, the parent SMF predicted from
the best-fit iHOD model agrees remarkably well with the observed
SMF in both the shape and the amplitude. Since the amplitude of
the predicted SMF is translated from the normalisation of the halo
mass function predicted by the $\Lambda$CDM cosmology, this agreement
is highly non-trivial, demonstrating the efficacy of the iHOD model:
in order to match the clustering and lensing signal measured in
SDSS, the mapping between the galaxy stellar content and the dark
matter halos is uniquely determined so that the expected galaxy
abundance at each $M_*$, calculated by summing galaxies at that $M_*$
over all the dark matter halos, automatically gives the observed
galaxy SMF in SDSS.

In future work, we will extend the iHOD model to provide
constraints on the quenching mechanisms that transform galaxies
from star-forming to quiescent. The most straightforward approach
is to apply the same analysis to just the quenched population (i.e.,
the galaxies with r>0.8 above the mixture limit), and compare
the inferred $p_{\rm red}(M_*, M_h)$ to the best-fit $p(M_*, M_h)$ in this paper
to predict the quenched fraction as a function of $M_*$ and $M_h$ (see,
e.g., Zehavi et al. 2005; Zehavi et al. 2011 for HOD studies of
galaxies segregated by colour). Alternatively, one can do a joint fit
for the red and blue galaxies simultaneously, assuming an ad hoc
functional form for the quenched fraction (see, e.g., Tinker et al.
2013; Rodriguez-Puebla et al. 2015). However, as mentioned in
Section 6.3, the galaxy assembly bias needs to be carefully treated
or marginalised over to obtain a robust constraint when modelling
colour-selected samples.

Going beyond the goal of understanding galaxy formation,
HOD modelling of galaxy populations is likely to have a role in
cosmological analyses of future large imaging surveys that seek to
constrain dark energy using weak lensing, such as the Large Synoptic
Survey Telescope (LSST; LSST Science Collaborations & LSST
Project 2009), Euclid (Laureijs et al. 2011), and WFIRST (Spergel
et al. 2015). While cosmological weak lensing was originally identified
as a very clean probe of dark energy due to its sensitivity
to matter fluctuations, more recent work has identified two major
theoretical uncertainties: the effect of baryons on the matter power
spectrum, and the intrinsic alignments of galaxy shapes that violate
the assumption that any coherent galaxy alignments are due to
weak lensing. Leading proposals for the mitigation of these theoretical
uncertainties include halo modelling of the galaxy-shear
cross-correlation (galaxy-galaxy lensing) in combination with galaxy
clustering and cosmic shear, instead of using cosmic shear alone
(e.g., Schneider & Bridle 2010; Semboloni et al. 2011, 2013; Zentner
et al. 2013). Thus, we anticipate that the iHOD model will be
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an important contribution to such efforts, given that it will enable 
the use of a larger galaxy sample in understanding these impor¬ 
tant contaminants to cosmic shear surveys, and thus enable the use 
of tighter priors on the nuisance parameters for baryonic effects 
and/or intrinsic alignments when extracting cosmological informa¬ 
tion from the cosmic shear signal. 

As a general yet powerful statistical formalism, the iHOD
model can be easily applied to galaxies at higher redshifts, such
as the existing datasets from the COSMOS survey (Scoville et al.
2007) used by L12, CFHTLenS (Heymans et al. 2012), the recently
finished Baryon Oscillation Spectroscopic Survey (BOSS; Eisenstein
et al. 2011; Dawson et al. 2013) and its near-term higher redshift
successor eBOSS, and the deeper surveys planned for future
facilities such as the Dark Energy Spectroscopic Instrument (DESI;
Levi et al. 2013), the Subaru Prime Focus Spectrograph (Takada
et al. 2014), Euclid, and WFIRST. Galaxies at higher redshifts, such
as luminous red galaxies (LRGs; Eisenstein et al. 2001; Anderson
et al. 2012) and emission line galaxies (Comparat et al. 2015), are
usually targeted with some complicated colour and flux cuts, so
the galaxy samples are always volume-incomplete. For instance,
Hoshino et al. (2015) found that the average number of LRG-type
central galaxies in the massive halos does not asymptotically reach
unity, contradictory to the assumption in traditional HOD models
(Parejko et al. 2013). Therefore, the iHOD model would be an
especially valuable tool in constraining the link between galaxies
and dark matter halos at high redshifts.
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APPENDIX A: SMALL-SCALE LENSING CUTOFFS 


In this appendix, we describe our choice of minimum scale for 
modelling of the lensing signals. Our estimator for AS, Eq. <01, 
is designed such that our signal includes a correction for the zero 
shear contributed by “source” galaxies that are at the lens redshift.
This correction is often written separately as a “boost factor” (e.g.,
Sheldon et al. 2004; Mandelbaum et al. 2005),

$$B(r_p) = \frac{\sum_{ls} w_{ls}(r_p)}{\sum_{rs} w_{rs}(r_p)}, \qquad (A1)$$

where the summed weights are over lens-source and random lens-source
pairs. $B(r_p) = 1$ if there are no galaxies associated with
the lens that are included as sources and if lensing magnification is
negligible. Generally, since both clustering and magnification are
large at small $r_p$, $B(r_p)$ is a monotonically decreasing function.
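
As an illustration of Equation (A1), the boost factor can be estimated by binning the pair weights in projected radius. The Python sketch below is hedged: the function and argument names are hypothetical, and the explicit normalisation by the numbers of lenses and random points (`n_lens`, `n_rand`) is an assumption added so that catalogues of different sizes can be compared; it is not spelled out in the equation as written.

```python
import numpy as np

def boost_factor(w_ls, r_ls, w_rs, r_rs, r_edges, n_lens=1, n_rand=1):
    """Ratio of summed lens-source to random-source weights per radial bin.

    w_ls, r_ls: weights and projected separations of lens-source pairs.
    w_rs, r_rs: the same for random lens-source pairs.
    n_lens, n_rand: catalogue sizes used for normalisation (assumption).
    Bins with no random pairs give a division by zero and should be masked.
    """
    sum_ls, _ = np.histogram(r_ls, bins=r_edges, weights=w_ls)
    sum_rs, _ = np.histogram(r_rs, bins=r_edges, weights=w_rs)
    return (sum_ls / n_lens) / (sum_rs / n_rand)

# Tiny hand-made example: one extra source near the lens in the inner bin.
r_edges = np.array([0.1, 1.0, 10.0])
b = boost_factor(np.ones(3), np.array([0.5, 0.5, 5.0]),
                 np.ones(2), np.array([0.5, 5.0]), r_edges)
print(b)
```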

As shown in Mandelbaum et al. (2005), small-scale systematics
due to deblending and sky subtraction can appear as an increase
in $B(r_p)$ on the smallest scales, followed by the expected decrease
on scales where those systematics no longer operate. In the decreasing
region, we know that source detection is no longer 100% effective,
but what we do not know is the actual level of inefficiency,
since $B(r_p)$ gives a single constraint on two completely degenerate
quantities (amount of contamination by physically-associated
galaxies, and inefficiency in the detection of both real sources
and physically-associated ones). To properly correct for
physically-associated sources, we would need to know both of these quantities.
We would also have to know how the shear estimates and photo-$z$
estimates of the sources in those regions may have been modified
due to the systematic that is causing difficulties with source
detection. Due to the lack of sufficient information to model the signal
on scales where $B(r_p)$ indicates small-scale systematics, we do not
plot or attempt to model $\Delta\Sigma$ there. The quoted minimum scales
used for the lensing modelling come from this consideration.

There are two additional systematics that could, in principle,
operate on small scales above our minimum cutoff. The first
systematic is the intrinsic alignment of galaxy shapes, pointing
coherently in the radial direction with respect to the lens. While intrinsic
alignments of bright red galaxies are well-established in the
literature (e.g., most recently, Singh et al. 2014), attempts to measure
any radial alignments of faint galaxies in mixed blue and red source
populations with respect to the positions of nearby bright galaxies
have thus far only resulted in null detections (Blazek et al. 2012)
with relatively tight upper limits. Hence we do not consider this to
be an important systematic for this work.

Lensing magnification has also been considered as a systematic
since, if present, it means that our normalization by $\sum_{rs} w_{rs}$ to
correct for physically-associated sources is incorrect. As discussed
in Mandelbaum et al. (2005) and Schmidt et al. (2009), the amount
of magnification for a flux- and size-selected sample depends on
both the slope of the number counts near the flux limit and the
slope of the apparent size distribution near its limit. Moreover, it
depends on the slopes of these distributions weighted by whatever
per-object weights are used for the lensing analysis. Simet &
Mandelbaum (2015) calculated the appropriate slopes for the source
catalogue used in this paper, and based on the numbers in their table
1, the ratio of observed number counts with magnification to the
un-magnified number counts is very near 1 ($n_{\rm obs}/n \approx 1 - 0.03\kappa$).
Hence, even though the best-fitting models for our eight stellar mass
samples have convergences that range from 0.0014 to 0.063 at
$0.1\,h^{-1}$Mpc, the maximum effect of magnification on the observed
counts, and therefore the size of the systematic error in the boost
factor and lensing calibration at that scale, is $-0.2$ per cent. We
therefore neglect this source of error.
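
The size of the magnification effect follows directly from the linear relation between the count ratio and the convergence quoted above. A short Python check of that arithmetic is given here; the function name is illustrative, and the coefficient of $-0.03$ is simply the value quoted in the text.

```python
def magnified_count_ratio(kappa, coefficient=-0.03):
    """Ratio n_obs/n for convergence kappa, using the quoted linear relation."""
    return 1.0 + coefficient * kappa

# Smallest and largest convergences quoted for the eight stellar mass samples.
for kappa in (0.0014, 0.063):
    effect_percent = (magnified_count_ratio(kappa) - 1.0) * 100.0
    print(f"kappa = {kappa:6.4f}: change in counts = {effect_percent:+.3f} per cent")
```

For the largest convergence the change in the counts is about $-0.19$ per cent, which rounds to the $-0.2$ per cent quoted above.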
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